DAD’s Arthur Flexer gave a virtual lecture on our plans for the DAD exhibition opening in early summer at the Austrian Museum of Folk Life and Folk Art. The lecture was given at the Austrian Research Institute for Artificial Intelligence (OFAI), where DAD was also located during the first nine months. This last CRITICAL SPACE was conducted to get feedback about our plans to build exhibits documenting our engagement with the collections of the Glyptothek of the Academy of Fine Arts Vienna, the Volkskundemuseum Wien and the Belvedere. Results for these three case studies differ according to the level of interaction between curators and machine: from using natural language processing tools for research in museum databases, to a symbiotic interaction between curators and algorithms, to robots visiting museums autonomously.
The ensuing conversation centered around ways to include and document aspects of algorithmic bias and societal stereotypes existing in natural language models. We also discussed our approaches to turning digital findings into analog exhibits and how using a robot to explore the Glyptothek aligns with the public’s (mis)conception that AI is predominantly about building machines, not software.
Our initial intention was to review and audit the current development of all our approaches after 18 months of the project’s time span in a (semi-public) symposium-esque format. Due to COVID-19, this CONCLUSIVE SPACE was a somewhat smaller affair in the form of a one-day workshop (29.1.2021) including the whole DAD team with all COVID precautions taken.
We planned the remaining half year of the project, which will be used for a final documentation of the assembled approaches, models, code and curatorial developments. This CONSTRUCTIVE SPACE will also finalize hands-on displays and physical proofs of concept for use in the final PRESENTATION SPACE.
This PRESENTATION SPACE will be a comprehensive exhibition documenting and presenting all achievements and works-in-progress at the Austrian Museum of Folk Life and Folk Art. Please watch this space for a making-of and behind-the-scenes documentation of this process, as well as an announcement of the exhibition, which will open at the beginning of summer.
DAD’s Arthur Flexer presented our work on analysing the semantic meaning of works of art at the international online conference “The Art Museum in the Digital Age” of the Belvedere Research Center. This conference is concerned with the digital transformation of art museums, which seems even more relevant lately because of COVID-19-related lockdowns and closures.
Arthur presented our (somewhat radical) approach of analysing text about artworks rather than taking the usual route of analysing images of the artworks. We chose this semantics-driven approach because a lot of information about an artwork cannot be found in the artwork itself. Think, for example, of subjecting the “Mona Lisa” to an automatic visual analysis. Computational results will tell you that it is a picture of a young woman in front of a landscape and (if your algorithm is really good) that she is sort of smiling. This information of course totally misses the significance of the painting for (Western) art history, its immense relevance and the many connotations it has. All of this is rather a societal construct and the result of centuries of discourse and reception history (for more on this see our previous blogpost). Our semantics-driven approach towards the collection of the Belvedere enables us to discover X degrees of keyword separation between works of art.
This is achieved by using the technique of word embedding [Mikolov et al 2013], which encodes semantic similarities between words by modelling the context formed by their neighboring words in a large training text corpus. We used this technique to embed the keywords of Belvedere’s online fine arts collection and to obtain pathways through the resulting semantic space.
The above result starts with a painting having keywords ‘Clouds’, ‘Mountain’, ‘Meadow’ from which we transit to ‘Mountain’, ‘Lake’, ‘Alps’ and ‘Austria’, next to a painting tagged ‘Fog’, then one with ‘Rocky coast’ and finally with ‘Clouds’, ‘Rocky coast’, ‘Sea’. Our pathway therefore smoothly transits from a mountain setting to a lake in the mountains to the sea.
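To make the idea of a pathway through a semantic space more tangible, here is a minimal toy sketch in Python. It is not our actual implementation: the two-dimensional keyword vectors are made up by hand (real word embeddings are learned from text and have hundreds of dimensions), the painting names are placeholders, and both the averaging of keyword vectors per painting and the greedy nearest-neighbor walk are assumptions chosen for simplicity.

```python
import math

# Made-up 2-d keyword vectors, purely for illustration. Real word embeddings
# are learned from a large text corpus and have hundreds of dimensions.
# Axis 0 is loosely "mountain-like", axis 1 loosely "sea-like".
KEYWORD_VECTORS = {
    "clouds": (0.5, 0.5),
    "mountain": (1.0, 0.0),
    "meadow": (0.9, 0.1),
    "lake": (0.6, 0.6),
    "alps": (1.0, 0.1),
    "austria": (0.9, 0.2),
    "fog": (0.5, 0.6),
    "rocky coast": (0.4, 0.8),
    "sea": (0.0, 1.0),
}

# Placeholder paintings with the keywords mentioned in the text above.
ARTWORKS = {
    "painting_1": ["clouds", "mountain", "meadow"],
    "painting_2": ["mountain", "lake", "alps", "austria"],
    "painting_3": ["fog"],
    "painting_4": ["rocky coast"],
    "painting_5": ["clouds", "rocky coast", "sea"],
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def embed_artwork(keywords):
    """Represent a painting as the mean of its keyword vectors (an assumption)."""
    vectors = [KEYWORD_VECTORS[k] for k in keywords]
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))


def greedy_pathway(start, artworks):
    """Walk from `start`, always moving to the most similar unvisited painting."""
    vectors = {name: embed_artwork(kws) for name, kws in artworks.items()}
    path = [start]
    remaining = set(artworks) - {start}
    while remaining:
        current = vectors[path[-1]]
        nearest = max(remaining, key=lambda name: cosine(current, vectors[name]))
        path.append(nearest)
        remaining.remove(nearest)
    return path


PATH = greedy_pathway("painting_1", ARTWORKS)
print(" -> ".join(PATH))
```

With these hand-picked vectors the walk moves from the mountain paintings via the fog to the coast and the sea. With a real embedding the order depends entirely on what the model has learned from its training text.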
We also presented one very concrete solution for a room in Belvedere’s permanent exhibition. It is a room about “Viennese portraiture in the Biedermeier period”, assembling the “greatest portrait painters” from this period. In the above picture you can see four blue frames which indicate empty slots that we would like to fill using our algorithm, with the respective neighboring artworks as input.
The keywords for these neighboring artworks, however, are purely descriptive, e.g. ‘headgear’, ‘necklace’, ‘bonnet’, ‘eye contact’, and probably do not do full justice to the semantic content of the artworks. We believe that one underlying topic of the Biedermeier room is ‘gender’, with all but one painting depicting females. We therefore add an additional algorithmic constraint by requiring all suggested artworks to respect both the requirement of being part of a pathway and that of having a ‘gender’ related keyword. Since ‘gender’ is not a keyword in the Belvedere taxonomy, we use word embedding to obtain Belvedere keywords with high similarity to the topic of ‘gender’. This translation step yields keywords like: ‘femaleness’, ‘religion’, ‘islam’, ‘equality’, ‘motherhood’ or ‘headscarf’. It is obvious that these keywords point to a stereotypical discourse of gender, quickly derailing towards topics of religion and a compulsion to wear headscarves, or women being predominantly seen in their role as mothers.
This is also why we termed the use of word embedding in this context world embedding: it confronts the very rigid taxonomy of the Belvedere keywords (based on Iconclass, a classification system for cultural content) with everyday language as represented in the textual training data of the word embedding. It thereby recontextualizes or even “resocializes” taxonomic art histories via natural language processing since it uncovers biases and prejudice in our use of language and (re-?) introduces them to the world of fine arts.
The above picture shows three paintings from the Biedermeier room plus four additional paintings (with red frames) which our algorithm suggests. The second painting from the left is suggested because its keyword ‘femaleness’ is a gender keyword and its keyword ‘necklace’ makes it similar to the keywords of the first painting (‘earrings’, ‘pearl necklace’) and the one in the middle (‘brooch’, ‘bracelet’). The 5th painting from the left is suggested because ‘headscarf’ is a gender keyword and ‘eye contact’ and ‘earring’ make it similar to both the painting in the middle (‘brooch’, ‘bracelet’, ‘eye contact’) and the painting on the far right (‘eye contact’, ‘bonnet’).
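The two steps described above, translating a topic into taxonomy keywords and then suggesting only paintings that carry one of those keywords and resemble their neighbors, can be sketched as follows. Again this is a toy illustration under loud assumptions rather than our actual code: the three-dimensional vectors are invented, the candidate names and keyword lists are hypothetical, the similarity threshold of 0.8 is arbitrary, and scoring a candidate by its mean cosine similarity to the neighboring paintings is just one plausible choice.

```python
import math

# Invented 3-d vectors (roughly: adornment, gender topic, landscape).
# A real embedding would be learned from text; these values are assumptions.
EMBEDDING = {
    "gender": (0.1, 1.0, 0.0),  # topic word, NOT part of the museum taxonomy
    "femaleness": (0.2, 0.9, 0.0),
    "headscarf": (0.4, 0.8, 0.0),
    "motherhood": (0.0, 0.9, 0.1),
    "necklace": (1.0, 0.1, 0.0),
    "earrings": (0.9, 0.2, 0.0),
    "brooch": (0.9, 0.1, 0.0),
    "eye contact": (0.5, 0.3, 0.1),
    "mountain": (0.0, 0.0, 1.0),
    "lake": (0.1, 0.0, 0.9),
}
TAXONOMY = set(EMBEDDING) - {"gender"}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def mean_vector(keywords):
    """Average the vectors of a painting's keywords (an assumption)."""
    vectors = [EMBEDDING[k] for k in keywords]
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))


def translate_topic(topic, threshold=0.8):
    """Return taxonomy keywords whose vectors are close to the topic word."""
    return {k for k in TAXONOMY if cosine(EMBEDDING[topic], EMBEDDING[k]) >= threshold}


def suggest(neighbors, candidates, required_keywords):
    """Keep candidates with a required keyword, ranked by similarity to neighbors."""
    neighbor_vectors = [mean_vector(kws) for kws in neighbors]
    ranked = []
    for name, kws in candidates.items():
        if not required_keywords.intersection(kws):
            continue  # constraint: at least one topic-related keyword
        vec = mean_vector(kws)
        score = sum(cosine(vec, n) for n in neighbor_vectors) / len(neighbor_vectors)
        ranked.append((name, score))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


GENDER_KEYWORDS = translate_topic("gender")

# Hypothetical neighbors of an empty slot and hypothetical candidates.
NEIGHBORS = [["earrings", "necklace"], ["brooch", "eye contact"]]
CANDIDATES = {
    "candidate_1": ["femaleness", "necklace"],
    "candidate_2": ["headscarf", "eye contact", "earrings"],
    "candidate_3": ["necklace", "brooch"],  # similar, but no gender keyword
    "candidate_4": ["motherhood", "lake"],  # gender keyword, but dissimilar
    "candidate_5": ["mountain", "lake"],
}
RANKED = suggest(NEIGHBORS, CANDIDATES, GENDER_KEYWORDS)
for name, score in RANKED:
    print(f"{name}: {score:.2f}")
```

Note that the stereotypes discussed above enter exactly at the translation step: whatever the embedding has learned to associate with ‘gender’ decides which paintings can be suggested at all.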
In the ensuing discussion with the conference’s audience Arthur Flexer advocated that our semantic approach is more helpful for building a curatorial narrative than a purely aesthetic procedure. It allows us to answer the question about curatorial gaps between artworks shown in an exhibition. What works of art exist in the holdings of the museum that fit the curatorial narrative but did not succeed in becoming part of the exhibition?
He also tried to make clear that by using a machine learning tool such as word embedding, curating becomes a joint endeavor of man and machine, where curatorial decisions have to be formulated as input and constraints to the algorithm. But even a simple curatorial Google search already is an interaction of man and machine, with algorithms (opaque to the curator) nevertheless shaping the curatorial enterprise to a certain extent by showing only specific selections of information. It was also discussed that such a man/machine approach is able to uncover algorithmic biases in the methods used, such as stereotypical representations of societal discourse in word embedding.
Looking towards future extensions of our work, we could of course analyse longer (art historic) texts about artworks with the same methodology, thereby gaining much richer semantic context than by relying on simple keywords only. Another possible extension is to embed semantic and visual information simultaneously, which could yield curatorial solutions that respect semantic and visual constraints at the same time [Frome et al 2013].
DAD’s Arthur Flexer engaged in an online meeting with Dr. Sandra Manninger (SPAN, Tsinghua SIGS, IAAC) and Dr. Matias del Campo (SPAN, Taubman College of Architecture and Urban Planning, University of Michigan) discussing “Data, Dust, Art & Techno”. Sandra Manninger and Matias del Campo are working at the intersection of Artificial Intelligence, Machine Learning and Architecture. Our conversation centered on the opportunities as well as possible pitfalls of applying AI to the arts. Please enjoy what is the first in a series of “AI chats” by Manninger and del Campo.
This includes a model of a weaving loom programmable by punch cards or a transparent model of the human brain.
There are also interactive stations, e.g. one trying to guess your emotional status via camera, which is apparently not easy with visitors wearing masks due to COVID-19. You can also try to teach a dog/cat classifier by showing it respective pictures.
The top floor is dedicated to the intersection of AI and the Arts, concentrating on music, literature and visual arts.
The DAD team will talk about “WOR(L)D EMBEDDING – Curating/Computing/Displaying Semantic Pathways through Belvedere’s Online Collection” at the annual conference of the Belvedere Research Center. This conference is concerned with the digital transformation of art museums. The dates are from 11th to 15th of January 2021 and the format is an online meeting. You can find out about the program and how to register for free at the respective website.
When applying machine learning to fine art paintings, one obvious approach is to analyse the visual content of the paintings. We discuss two major problems which caused us to take a semantic route instead: (i) state-of-the-art image analysis has been trained on photos and does not work well with paintings; (ii) visual information obtained from paintings is not sufficient for building a curatorial narrative.
Let us start by using the DenseCap online tool to automatically compute captions for a photo of two dogs playing.
The DenseCap model correctly identifies the dogs and many of their properties (e.g. “the dog is brown”, “the dog has brown eyes”, “the front legs of a dog”, “the ear|head|paw of a dog”) as well as aspects of the background (“a piece of grass”, “a leaf on the ground”). There are some wrong captions for the dogs (“the dogs tail is white”, but there is no tail in the picture) and also for the background (“the fence is white”). But all in all the computer vision system does a good job at what it has been trained to do: localize and describe salient regions in images in natural language [Johnson et al 2016].
Let us now apply this system to a fairly realistic dog painting from the collection of our partner museum Belvedere, Vienna.
Again many characteristics of the dog are correctly identified (“the dog is looking at the camera”, “the eye|ear of the dog”) and also “a bowl of food”. The background already provides more problems, with some confusions still comprehensible (“a white napkin”, “the curtain is white”) but others less so (“the flowers are on the wall”).
Testing the system on a more abstract painting, Belvedere’s collection highlight “The Kiss” by Gustav Klimt, yields even stranger results.
While some captions are correct (“the mans hand”, “the dress is yellow”, “the flowers are yellow|green”), others are somewhat off (“the hat is on the mans head”) or just completely wrong (“the picture is white”, “the wall is made of bricks”, “a black door”, “a window on the building”). The essential aspect of the painting, a man and a woman embracing, is not comprehended at all. Of course this is understandable since the DenseCap system has been trained on 94,000 photo images (plus 4,100,000 localized captions) but not on fine art paintings, which explains why it cannot generalize to more abstract forms of art.
On the other hand, even if an image analysis system could perfectly detect that “Caesar am Rubicon” shows a dog looking at a sausage in a bowl on a table, it would still not grasp the meaning of the painting: Caesar is both the name of the dog and the historical figure who crossed the Rubicon, which was a declaration of war on the Roman Senate, ultimately leading to Caesar’s ascent to Roman dictator. Hence “crossing the Rubicon” is now a metaphor for passing a point of no return.
The same holds for Gustav Klimt’s “The Kiss”. Even if the image analysis system were not fooled by Klimt’s use of mosaic-like two-dimensional Art Nouveau organic forms and would be able to detect two lovers embracing in a kiss, it would still not grasp the significance of the decadence conveyed by the opulent exalted golden robes or the possible connotation to the tale of Orpheus and Eurydice.
The DAD project is about exploring the influence of Artificial Intelligence on the art of curating. From a curatorial perspective, grasping the semantic meaning of works of art is essential to build curatorial narratives that are not just based on a purely aesthetic procedure. See our previous blogposts on such a semantics-driven approach towards the collection of the Belvedere, where we chose to analyse text about the paintings rather than the paintings themselves.
On 12.11.2020 DAD’s Niko Wahl presented our intermediate results on the third research day of the Academy of Fine Arts, Vienna. Due to COVID-19, this event was an online ZOOM meeting. The goal of the ]a[ research days is to give an overview of all ongoing research projects at the academy including discussions with all participating colleagues.
Niko Wahl gave a short introduction to our project and an overview of DAD’s collaborations with three different museums, where we work with an archive of an ethnological journal, a fine arts gallery and the statues in the Academy’s Glyptothek.
Since our work with the Austrian Museum of Folk Life and Folk Art and with the Belvedere, Vienna, has already been documented in previous blogposts, let’s turn to the presentation of plaster casts at the Academy’s Glyptothek, which we explored with Dusty, an off-the-shelf household robot.
Many people associate Artificial Intelligence (AI) with the development of ever more powerful and dextrous robots, along with horror scenarios of these machines taking over the planet. In reality robots are a small part of AI which is rather dominated by machine learning software solutions powering your Internet search engine, the natural language interface to your mobile phone, online music, movie and product recommendations and many other everyday technologies.
On the other hand, many people already own robots with limited forms of AI, for instance vacuum cleaning robots. What if we confront such a household robot with a – supposedly obsolete – museum collection of historic plaster copies of famous statues, whose very physis seems to be made of dust?
The robot takes its own route through the museum space. Following its built-in algorithms it perpetually finds new ways through the collection. It seemingly decides for itself in what order to visit the museum objects, all the time metaphorically internalizing the objects of art while inhaling their dust.
Other visitors are free to follow the robot on its path through the museum space, engaging with its exhibition narrative. They might benefit from surprising relationships between objects of art established by the often creative course of the robot. Smart latest-generation vacuum cleaning robots are able to share their sensory experiences with others of their kind. These shared experiences usually are measurements of objects and how to avoid them when traversing a room. But what if this cloud communication, usually not accessible to us, deals with objects of art instead of everyday items? Will meeting David or the Pieta change the robots’ discourse? What if the robot meets a portrait of itself?
Do you remember Dusty, the vacuum cleaner robot that explored a model version of the Glyptothek during this spring’s COVID-19-related lockdown? This summer Dusty was able to experience the real Glyptothek, using its somewhat limited artificial intelligence, basically trying to avoid obstacles on its way through the maze of shelves full of plaster casts.
The Glyptothek of the Academy of Fine Arts, Vienna, is a collection of plaster casts dating back to the late 17th century. Its main task was to serve as study material for Academy students, containing copies of a canon of world-renowned sculptures, ranging from plaster casts of Egyptian originals to copies of Greek and Roman, medieval, Renaissance and historicist statues. This collection of copies of works of art can be seen as an early analog blueprint of digital collections: the Glyptothek made the essence of European sculpture available to local audiences, who could enjoy international pieces of art without leaving their home town, very much like today’s internet population can access digital images of the world’s artistic heritage at the click of their handheld device.
Speaking of digital images, the above image of Dusty in the Glyptothek actually is a digital copy of an analog photograph, which in itself is an analog copy of a plaster cast which is a copy of a statue which is a copy of a real (or imagined) person …