The Painter By Numbers competition challenges Kagglers to examine pairs of paintings and predict whether the paintings are by the same artist. The exciting thing about constructing the competition in this way is that instead of learning to label paintings as 'van Gogh' or 'Rembrandt', the algorithm is learning how to distinguish between artists. This means that the algorithm can be used to extrapolate to artists whose work it has never been trained on.
After participating in a dozen or so Kaggle competitions, I was excited to design and run this Kaggle competition myself!
Building a classification algorithm that can extrapolate to new classes
The pairwise comparison aspect of the competition motivates using a "siamese" neural network. In a siamese neural network, two inputs are processed by the same network with shared weights, and the two outputs are then combined into a single prediction. I prepared some sample code for training a siamese neural network on the MNIST data set: mnist_siamese_cnn.py.

Desirable improvements to existing image classification algorithms

I hoped that this challenge would motivate people to make two kinds of improvements to imaging algorithms:
Possible strategies that might be worth trying:
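One concrete starting point is the siamese architecture described above. The sketch below is a toy illustration of the shared-weights idea using NumPy only (the dimensions, random weights, and logistic readout are all made up; a real entry would use a convolutional network as the shared branch):

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared "embedding" weights: both images pass through the same projection.
# A toy stand-in for a shared convolutional branch; sizes are arbitrary.
W = rng.normal(size=(64, 16))  # 64-pixel "image" -> 16-d embedding

def embed(x):
    """Shared branch: the same W is applied to both inputs."""
    return np.tanh(x @ W)

def siamese_score(x1, x2, v):
    """Combine the two embeddings and map to a same-artist probability."""
    d = np.abs(embed(x1) - embed(x2))        # element-wise distance features
    return 1.0 / (1.0 + np.exp(-(d @ v)))    # logistic readout

v = rng.normal(size=16)
x1, x2 = rng.normal(size=64), rng.normal(size=64)
p = siamese_score(x1, x2, v)  # probability in (0, 1)
```

Because the combination uses the absolute difference of the two embeddings, the score is symmetric in its inputs, which is a natural property for a "same artist?" question.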
Leakage
I've identified two possible sources of leakage and I'm curious to see how significant they will be as the competition progresses:
In addition, I've also wondered whether the works in the data set are truly high enough resolution; I certainly anticipate that algorithms would perform better with higher-resolution images. But hey, let's see what we can do with the data that we do have!

Including pixel density and resolution information

On the Kaggle forum, user Guilhermo Folego pointed out that it is desirable to have a data set where the pixel densities per unit of physical area are available. Unfortunately, wikiart does not always include the physical dimensions of each painting. I decided that the data set as it stands is still interesting; after all, people can differentiate a Vermeer from a van Gogh without knowing what resolution the photographs of the paintings were taken at. It's also true that creating a data set where all the images are the same resolution would likely allow the algorithm to make better use of information at the scale of brush strokes. Because the existing data set contains images with a range of resolutions, the algorithm is unlikely to be able to make use of high-resolution details.
One of the challenges with setting up the competition was that the test set contained around 20,000 images. Asking competitors to submit predictions for every possible pair of paintings would be unwieldy, both because the time required to generate every prediction is onerous and because the submission file would run to several GB! To reduce the number of comparisons, I decided to group the paintings according to the artist-style metadata available on wikiart. This means the algorithm is generally being asked to do the more challenging task of evaluating pairs of paintings that are more similar: it's a more interesting challenge to see if the algorithm can distinguish Vermeer from Rembrandt than Hokusai from Rembrandt. The script that groups the paintings is plot_artist_style_overlaps.py.
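The within-style pairing step can be sketched as follows. The image names and style labels here are invented for illustration; the point is that only pairs within a style group are generated, rather than all n-choose-2 pairs over the whole test set:

```python
from itertools import combinations

# Hypothetical style labels for a handful of test-set images; in the
# competition these came from the wikiart style metadata.
style_of = {
    "img1.jpg": "Baroque",
    "img2.jpg": "Baroque",
    "img3.jpg": "Baroque",
    "img4.jpg": "Ukiyo-e",
    "img5.jpg": "Ukiyo-e",
}

# Group images by style, then form pairs only within each group.
groups = {}
for img, style in style_of.items():
    groups.setdefault(style, []).append(img)

pairs = [pair
         for imgs in groups.values()
         for pair in combinations(sorted(imgs), 2)]
```

For these five images this produces 4 pairs instead of the full 10, and the saving grows quadratically with the size of the test set.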
The AUC metric was used to evaluate the algorithms. The three top-scoring algorithms all had AUCs greater than 0.9. If I had a surplus of time and ambition, I would build an app that serves pairs of paintings to human viewers in order to determine whether the algorithm outperforms people.
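For reference, the AUC can be computed directly from its definition: the probability that a randomly chosen same-artist pair receives a higher score than a randomly chosen different-artist pair, with ties counting one half. This is a minimal sketch, not the competition's evaluation code:

```python
def auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) score pairs in which
    the positive is ranked higher; ties contribute 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to random guessing, and 1.0 to a perfect ranking of same-artist pairs above different-artist pairs.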
One thing I did do to compare the top algorithm to human performance was to slip a famous forger into the test set. Han van Meegeren was a Dutch artist who created several paintings in the style of Dutch masters from the 1600s. Van Meegeren was bitter about the lack of commercial success of his own art, so he took up creating "new" paintings by old masters, and he even sold some forgeries to the Nazis during World War II. I generated a pairwise comparison table for first-place winner orange-nejc's predictions for the van Meegeren and Vermeer paintings in the test set (the paintings compared are listed below).
Paintings in the pairwise comparison: Dinner at Emmaus (van Meegeren), Interior with couple and clavichord (van Meegeren), The Milkmaid (Vermeer), The Geographer (Vermeer), The Astronomer (Vermeer), and Lady at the Virginals with a Gentleman (Vermeer).
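A pairwise comparison table like the one described above can be assembled as follows. The same_artist_probability function here is a made-up, symmetric stand-in for the trained model's prediction; a real table would come from running the model on each pair of images:

```python
import numpy as np

paintings = [
    "Dinner at Emmaus",                        # van Meegeren
    "Interior with couple and clavichord",     # van Meegeren
    "The Milkmaid",                            # Vermeer
    "The Geographer",                          # Vermeer
    "The Astronomer",                          # Vermeer
    "Lady at the Virginals with a Gentleman",  # Vermeer
]

def same_artist_probability(a, b):
    """Stand-in for the model's pairwise prediction. The (min, max)
    canonicalisation makes the made-up score symmetric in its inputs."""
    rng = np.random.default_rng(hash((min(a, b), max(a, b))) % (2**32))
    return rng.uniform()

n = len(paintings)
table = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        table[i, j] = (1.0 if i == j
                       else same_artist_probability(paintings[i], paintings[j]))
```

The resulting 6x6 table is symmetric with ones on the diagonal, which is the shape of the comparison I ran on orange-nejc's predictions.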
One of the goals of this competition was to determine whether the algorithm could extrapolate to images by artists who didn't have work in the training set. I've computed the AUC for orange-nejc's predictions in two cases: pairs involving only artists whose work appears in the training set, and pairs involving artists with no work in the training set.
I should do some bootstrapping to generate meaningful error bars, but it looks like orange-nejc's algorithm is better at interpolating among artists whose work it has seen before than at extrapolating to work by unfamiliar artists.
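The bootstrapping I mention could look something like this: resample label/score pairs with replacement, recompute the AUC on each resample, and report a 95% percentile interval. Everything below is an illustrative sketch on synthetic data, not an analysis of the actual submissions:

```python
import numpy as np

def auc(y_true, y_score):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    wins = ((pos[:, None] > neg[None, :]).sum()
            + 0.5 * (pos[:, None] == neg[None, :]).sum())
    return wins / (len(pos) * len(neg))

def bootstrap_auc(y_true, y_score, n_boot=1000, seed=0):
    """Resample predictions with replacement and return a 95% percentile
    interval for the AUC."""
    rng = np.random.default_rng(seed)
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        yt, ys = y_true[idx], y_score[idx]
        if yt.min() == yt.max():   # resample drew only one class; skip it
            continue
        aucs.append(auc(yt, ys))
    return np.percentile(aucs, [2.5, 97.5])
```

With intervals like these for the seen-artist and unseen-artist cases, the interpolation-versus-extrapolation gap could be checked for overlap rather than eyeballed.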