

Building DIY Human Action Detectors With Matroid

Posted on October 4th, 2018

[This article originally appeared at Synced Review and is reproduced here with permission]

Computers are now excellent at recognizing images of human faces, cats and dogs — but struggle when it comes to detecting continuous actions, for example determining if a character in a video might be “dancing the tango.” Computers also fall short in detecting nuanced expressions of human emotions.

Matroid is a California-based startup focused on AI video monitoring and detector tools and services. Founder and CEO Reza Zadeh tells Synced the company’s advanced computer vision technologies can detect subjects’ actions, emotions, and other personal characteristics.

Matroid real-time human emotion tracking

For example, to build a specialized detector to identify instances of “man holding guns” in video footage, Zadeh creates a new project at Matroid.com, selects example images from the Matroid cloud gallery, adds similar images and videos from additional sources, and clicks “Create the Detector.” In a few minutes, the new detector is marking all scenes in a Clint Eastwood film that contain a “man holding guns.”

No machine learning expertise is needed, and although the process may take longer for more unusual detector requests, Zadeh says overall it’s about as easy as using Photoshop.

Founded in 2016, Matroid has attracted interest from media companies and security customers. “The market is growing because you make things possible that were not possible before,” says Zadeh. For example, the Internet Archive, a San Francisco–based nonprofit digital library, uses Matroid services to track celebrities’ and politicians’ appearances on television. In the security field, Matroid can monitor camera feeds and automatically send a notification when a particular person appears, which can be a helpful screening tool for human receptionists.
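The notification behavior described above can be illustrated with a minimal sketch. It is not Matroid's implementation or API: the function name, the score threshold, and the cooldown period are all hypothetical, and the feed is a hand-written list of (timestamp, match score) pairs standing in for a detector's per-frame output.

```python
from typing import Iterable, List, Tuple


def alert_times(
    frames: Iterable[Tuple[float, float]],
    threshold: float = 0.9,     # hypothetical minimum score to count as a match
    cooldown_s: float = 30.0,   # hypothetical quiet period between notifications
) -> List[float]:
    """Return the timestamps at which a notification would be sent.

    Each frame is (timestamp_seconds, match_score). A notification fires when
    the score reaches the threshold and no notification was sent within the
    last cooldown_s seconds, so one appearance does not trigger repeated alerts.
    """
    alerts: List[float] = []
    last_alert = None
    for timestamp, score in frames:
        if score < threshold:
            continue
        if last_alert is None or timestamp - last_alert >= cooldown_s:
            alerts.append(timestamp)
            last_alert = timestamp
    return alerts


# Made-up feed: the person appears at 1-2 s and again at 40 s.
feed = [(0.0, 0.20), (1.0, 0.95), (2.0, 0.97), (40.0, 0.93), (41.0, 0.50)]
print(alert_times(feed))  # [1.0, 40.0]
```

The cooldown is a design choice in this sketch: without it, every consecutive frame containing the person would send its own notification.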

“Allowing computers to see as humans do” is Zadeh’s goal. An adjunct professor at Stanford University, Zadeh has been exploring machine learning since he interned with Google AI research at age 18.

Matroid Founder and CEO Reza Zadeh speaks at Stanford University

Matroid now has over 20 engineers and has received US$13.5 million in funding from Intel and New Enterprise Associates. “It is a very exciting time to be in computer vision because of the way computer vision has been completely overturned by machine learning. I’ve never seen such a big impact of machine learning in any field, and it is sort of building on itself rapidly as we detect more and more things,” says Zadeh.

But the computer vision battle is heating up — hundreds of hopefuls have hopped on the bandwagon over the last several years. In China, computer vision startups raise hundreds of millions of dollars to drive research innovations. They are also encouraged by the Chinese government to open-source their platforms, algorithms, and datasets.

US tech giants Google, Microsoft, and Facebook have also developed a number of open-source machine learning libraries and tools to benefit developers, and are revamping their services and smart products with AI algorithms.

Matroid has its own approach. Its flagship product Matroid Studio targets business customers without machine learning knowledge, enabling them to design AI models by just clicking and dragging. Matroid is also developing a detector marketplace, with thousands of production-level detectors created and shared by users.

“Maybe today we’re not good at detecting something like a dent in a car, but probably tomorrow we will be good at it because one of our users makes a detector,” says Zadeh. He tells Synced that some user projects surprised even him, such as detectors for “weight lifting” or “lying down.”

User-created detector for “lie/sleep”

Matroid On-Premise is a cloud-based workstation developers can use to monitor video streams for up-to-the-minute alerts about people, events, or logos appearing across a set of TV channels. There is also an on-camera version for large-scale deployment on embedded devices such as video surveillance cameras.

Zadeh says technical issues are not the main challenge facing Matroid. Rather, he says, the barrier to AI adoption, particularly in the US and Europe, lies more in a lack of appreciation for its potential: “This is such a new technology, people are not that creative with it yet.”

Also, AI is often portrayed negatively to the general public. Amazon is reportedly providing its facial recognition services to law enforcement, which has sparked criticism from civil rights organizations. A recent Brookings survey discovered that almost 50 percent of Americans worry that AI will reduce personal privacy.

There are also questions of accuracy and bias. Last month an American Civil Liberties Union report claimed that Amazon’s facial recognition tech had wrongly matched 28 members of Congress to criminal mugshots. (Amazon fired back that the ACLU had simply misused the method.) There could be similar misidentification risks with Matroid if settings are incorrect or training data contains too much noise. Matroid has developed a number of tools to collect user feedback on such issues.
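To make the point about settings concrete, here is a minimal sketch of how one common setting, a confidence threshold, changes the number of wrong matches a recognition system reports. The scores are invented for illustration and do not come from Matroid, Amazon, or the ACLU test; the function name and threshold values are likewise assumptions.

```python
from typing import List, Tuple

# Made-up data: (confidence score from a hypothetical detector,
#                whether the reported match is actually correct).
SCORED_MATCHES: List[Tuple[float, bool]] = [
    (0.99, True),
    (0.97, True),
    (0.93, False),
    (0.88, True),
    (0.85, False),
    (0.82, False),
    (0.61, False),
]


def count_matches(matches: List[Tuple[float, bool]], threshold: float) -> Tuple[int, int]:
    """Return (correct, wrong) counts among matches scoring at or above threshold."""
    accepted = [correct for score, correct in matches if score >= threshold]
    correct_count = sum(accepted)
    wrong_count = len(accepted) - correct_count
    return correct_count, wrong_count


for threshold in (0.80, 0.95):
    correct_count, wrong_count = count_matches(SCORED_MATCHES, threshold)
    print(f"threshold={threshold:.2f}: {correct_count} correct, {wrong_count} wrong")
```

With these invented scores, the looser threshold accepts three wrong matches alongside three correct ones, while the stricter threshold accepts no wrong matches but also drops one correct match. That trade-off is why the choice of setting matters.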

Zadeh describes Matroid’s mission — “detecting everything” — as an ongoing process, especially when it comes to obscure or arcane human actions: “We have a long way to go. For all verbs [actions] to be detected, eventually we will need full AI, which I don’t think is going to come around until many decades from now — but probably in my lifetime!”