Since Facebook’s algorithm just banned a robin Christmas card for being “too sexual”, let’s look at how Inception3 does at classifying art images.
An easy and fun way of doing this is to go to: https://transcranial.github.io/keras-js/#/inception-v3
Here you can run the entire model interactively in the browser with no setup required, and there’s also a nice graphical walk-through of the model execution.
Alternatively go to https://www.tensorflow.org/tutorials/image_recognition for instructions on getting the latest code.
First off, as a baseline, this fairly naturalistic painting of a zebra is reassuringly recognised as a zebra with high probability:
Though when I run the same model from tensorflow.org I get:
bee eater (score = 0.19032)
bittern (score = 0.05935)
jacamar (score = 0.02757)
goldfinch, Carduelis carduelis (score = 0.02259)
robin, American robin, Turdus migratorius (score = 0.02180)
Well, they are all birds and robin is on the list, though the model thinks it is much more likely to be a bee eater.
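For anyone wondering how a listing like that comes about, here is a minimal sketch of the ranking step, assuming the model hands back one score per label. The `top_k` helper and the `scores` dictionary are made up for illustration (this is not the tutorial's actual code); five of the scores are copied from the listing, and the "tray" entry is an invented filler to show a label that misses the cut.

```python
def top_k(scores, k=5):
    """Return the k (label, score) pairs with the highest scores."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical input: five scores from the listing, deliberately out of order,
# plus one invented low score ("tray") that should not make the top five.
scores = {
    "bittern": 0.05935,
    "bee eater": 0.19032,
    "robin, American robin, Turdus migratorius": 0.02180,
    "jacamar": 0.02757,
    "goldfinch, Carduelis carduelis": 0.02259,
    "tray": 0.00100,
}

for label, score in top_k(scores):
    print(f"{label} (score = {score:.5f})")
```

Note that the five scores add up to well under 1: the rest of the probability is spread thinly over the other labels, which is another sign the model is not very sure.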
Let’s try another one:
Not a robin. Well ok, maybe it didn’t get a lock on the head due to the shape of the image:
[Actually the original is 1 metre tall, which is rather large for a robin, but we didn’t tell Inception3 about that.]
If we crop the image to a centred square we get a better result… it is still not sure what it is, but probably a bird:
On the command line I get:
hornbill (score = 0.28264)
cock (score = 0.08676)
bee eater (score = 0.06641)
macaw (score = 0.05226)
tray (score = 0.02220)
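The square crop mentioned above is easy to work out by hand, but here is a small sketch of the arithmetic, assuming the subject sits roughly in the middle of the picture. The `center_square_box` helper and the 600 x 1000 dimensions are made up for illustration; the box uses the (left, upper, right, lower) order that image libraries such as Pillow expect.

```python
def center_square_box(width, height):
    """Return (left, upper, right, lower) for the largest centred square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


# A tall portrait image; the dimensions are invented for the example.
box = center_square_box(600, 1000)
print(box)  # (0, 200, 600, 800)
```

With Pillow installed, something like `Image.open(path).crop(box).resize((299, 299))` should then give the model a square input instead of a squashed tall one. If the bird's head is near the top of the painting, a centred crop may cut it off, so the box might need shifting upwards.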
This tiger was recognised much better than I expected:
Tiger face painting is classified as a mask. OK, that’s not too bad:
This mural is not realistic, but we can certainly tell it’s a representation of a tiger. It is more difficult for the computer, though:
To be honest, I don’t really recognise this as a tiger; it looks like a taxidermy disaster. Inception3 recognises it, though, albeit with a lower confidence level:
Well, the model is limited by its training data:
- ImageNet holds about 14 million images, but the 1,000-noun subset used to train this model is closer to 1.2 million, at 299×299 pixels or less. That is not small, but it is limited (for example, Apple’s Chinese handwriting recognition was trained on tens of millions of images to recognise 30,000 characters).
- this model is not trained to recognise artistic representations of objects, only photos of real objects
It’s possible the model has learned that a face with stripes is a tiger, and that the painted face is then marked as 0% tiger because the long hair is a definite negative indicator of tiger-ness.
But it looks like the model may not have learned that a bird with an orange-red front is a robin.
Maybe Facebook’s model did, though: perhaps it correctly identified the card as a picture of a “robin red breast”… and then the next algorithm checked the text and decided that a picture of a red breast was not acceptable…