

Wired Article Discussing Matroid Detectors

Posted on December 3rd, 2018

 

This article originally appeared on Wired, with an excerpt below.

 

Image-recognition classifiers—like the one Tumblr ostensibly deployed—are trained to spot explicit content using datasets typically containing millions of examples of porn and not-porn. The classifier is only as good as the data it learned from, says Reza Zadeh, an adjunct computer science professor at Stanford University and the CEO of computer vision company Matroid. Based on examples of flagged content that users posted on Twitter, he says it’s possible Tumblr neglected to include enough instances of things like NSFW cartoons in its dataset. That might account for why the classifier mistook Burstein’s patent illustrations for adult content, for example. “I believe they've forgot about adding enough cartoon data in this case, and probably other types of examples that matter and are SFW,” he says.

“Computers are only recently opening their eyes, and it's foolish to think they can see perfectly.”

REZA ZADEH, MATROID

WIRED tried running several Tumblr posts that were reportedly flagged as adult content through Matroid’s NSFW natural imagery classifier, including a picture of chocolate ghosts, a photo of Joe Biden, and one of Burstein’s patents, this time for LED light-up jeans. The classifier correctly identified each one as SFW, though it thought there was a 21 percent chance the chocolate ghosts might be NSFW. The test demonstrates there’s nothing inherently adult about these images—what matters is how different classifiers look at them.
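To make the mechanics concrete: classifiers like the one described above output a probability rather than a yes/no answer, and the final SFW/NSFW label depends on a decision threshold. The sketch below is hypothetical—the model, threshold, and all scores except the article's 21 percent figure are illustrative assumptions, not Matroid's or Tumblr's actual system.

```python
# Toy illustration of thresholded binary classification. A real system
# runs a trained neural network over image pixels; here we just map a
# precomputed "NSFW probability" to a label.

THRESHOLD = 0.5  # hypothetical decision threshold, not a real system's value

def classify(nsfw_probability: float) -> str:
    """Label an image NSFW if its score crosses the threshold, else SFW."""
    return "NSFW" if nsfw_probability >= THRESHOLD else "SFW"

# The 0.21 score mirrors the article's chocolate-ghosts result; the
# other scores are made up for illustration.
examples = {
    "chocolate ghosts": 0.21,
    "photo of Joe Biden": 0.02,
    "LED light-up jeans patent": 0.05,
}

for name, score in examples.items():
    print(f"{name}: {classify(score)} ({score:.0%} NSFW)")
```

Lowering the threshold catches more explicit content but flags more innocent images like these; raising it does the reverse—which is why two classifiers can disagree about the same picture.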

 

“In general it is very easy to think ‘image recognition is easy,’ then blunder into mistakes like this,” says Zadeh. “Computers are only recently opening their eyes, and it's foolish to think they can see perfectly.”
