Introducing Similarity Search


John Goddard | March 9th, 2019


Cheap storage, ubiquitous connected cameras, and an explosion of user-generated content have led to an exponential increase in the amount of visual media on the internet and on company servers. Unfortunately, the tools to explore this content have lagged behind: the images and videos are there, but how do you find what you’re looking for?

Enter Matroid Similarity Search. With Similarity Search, you can easily connect and search large collections of images and videos. Upload a query image and then select the part you’re interested in, and you’ll instantly see a list of the most similar media in your collection.

Alternatively, you can query by detector label scores to get a list of images with scores above a set threshold for specified labels.
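To make the label-score query concrete, here is a minimal sketch in Python. It is not Matroid's API: the in-memory `collection` dictionary, the function name `query_by_label_scores`, and the choice to rank matches by their weakest requested score are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical stand-in for an indexed collection:
# media id -> {detector label: score between 0 and 1}.
collection: Dict[str, Dict[str, float]] = {
    "img_001.jpg": {"hat": 0.92, "person": 0.88},
    "img_002.jpg": {"hat": 0.40, "person": 0.95},
    "img_003.jpg": {"hat": 0.75, "person": 0.81},
}


def query_by_label_scores(
    collection: Dict[str, Dict[str, float]],
    thresholds: Dict[str, float],
) -> List[str]:
    """Return ids whose score is above the threshold for every requested label."""
    if not thresholds:
        raise ValueError("at least one label threshold is required")

    matches = []
    for media_id, scores in collection.items():
        # A label the detector never reported counts as a score of 0.0.
        requested = [scores.get(label, 0.0) for label in thresholds]
        if all(scores.get(label, 0.0) > t for label, t in thresholds.items()):
            matches.append((min(requested), media_id))

    # Assumed ranking: strongest "weakest requested score" first, ties by id.
    matches.sort(key=lambda m: (-m[0], m[1]))
    return [media_id for _, media_id in matches]


results = query_by_label_scores(collection, {"hat": 0.7, "person": 0.8})
print(results)  # ['img_001.jpg', 'img_003.jpg']
```

Here `img_002.jpg` is dropped because its `hat` score falls below the threshold, even though its `person` score is the highest in the collection.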

Your data, your search indexes

So what’s the fuss? Google and other companies have had reverse image search for years. Searching by image isn’t new, but Matroid Similarity Search represents two key breakthroughs in visual media search products:

  1. You can use it on your data.
  2. You can create a custom search index, tailored to find exactly what you’re interested in.

Google reverse image search is a fantastic tool if you’re looking for images publicly available on the internet, but for most of our customers that’s not the case. Our users have mountains of their own, often sensitive, media hosted on private servers, inaccessible to existing search tools. Drawing on our expertise in developing intuitive and scalable machine learning tools, Matroid makes it simple to connect this media as a collection, and to create an index searchable only by authorized users at your organization.

With existing visual search tools, you’re also limited by whatever indexes those tools choose to use. For example, say you make a query with a picture of a man wearing a red hat while lying down. You might be interested in any of the following types of results:

  • Other pictures of the same man
  • Other pictures of people wearing similar red hats
  • Other pictures of people lying down

A traditional search tool won’t understand your intentions; it may give results of one type when you’re actually interested in another. Matroid solves this by letting you index your collection with any of our detectors and then allowing you to choose which index to use when searching.

So, in the above example, if you were interested in finding the same man, you could index your collection with a facial recognition detector. If you were interested in red hats, you could index the collection with a clothing detector. If you were interested in other people lying down, you could index with the Matroid Actions detector. But what happens if there isn’t a detector that can give you the kind of search results you’re looking for?

Like much of the Matroid product, Similarity Search is bolstered by the ability to quickly create, experiment with, and deploy a detector that can find whatever you’re interested in. Say you’re now actually interested in finding people with similar facial hair to the man in your query image, but there isn’t an existing facial hair detector. In a matter of minutes, you can create one yourself, and then reindex your collection with it. The new index will differentiate media in your collection based on the features most relevant to facial hair, and you’ll be able to find other examples of similarly majestic handlebar mustaches. And if the index doesn’t give you the results you were hoping for at first, you can redo and tweak your detector until it does.

In essence, when you index a collection with a custom detector, you are searching your collection using the features learned from many images at the same time, which is also first-of-its-kind functionality.
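The idea of choosing an index at query time can be sketched as follows. This is an illustrative toy, not Matroid's implementation: it assumes each index stores one feature vector per media item (as a detector might produce) and that results are ranked by cosine similarity. The `Collection` class, its method names, and the two-dimensional vectors are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class Collection:
    """Hypothetical collection holding several named indexes over the same media."""

    def __init__(self) -> None:
        # index name -> {media id -> feature vector from that index's detector}
        self.indexes: Dict[str, Dict[str, Vector]] = {}

    def add_index(self, name: str, features: Dict[str, Vector]) -> None:
        self.indexes[name] = dict(features)

    def search(self, index_name: str, query: Vector, top_k: int = 3) -> List[Tuple[str, float]]:
        """Rank media in the chosen index by similarity to the query vector."""
        index = self.indexes[index_name]
        scored = [(cosine_similarity(query, vec), media_id) for media_id, vec in index.items()]
        scored.sort(key=lambda s: (-s[0], s[1]))  # most similar first, ties by id
        return [(media_id, score) for score, media_id in scored[:top_k]]


photos = Collection()
# Toy vectors standing in for features from two different detectors.
photos.add_index("face", {"a.jpg": [0.9, 0.1], "b.jpg": [0.1, 0.9], "c.jpg": [0.7, 0.7]})
photos.add_index("clothing", {"a.jpg": [0.0, 1.0], "b.jpg": [1.0, 0.1], "c.jpg": [0.5, 0.5]})

query = [1.0, 0.0]
print([m for m, _ in photos.search("face", query)])      # ['a.jpg', 'c.jpg', 'b.jpg']
print([m for m, _ in photos.search("clothing", query)])  # ['b.jpg', 'c.jpg', 'a.jpg']
```

The same three photos come back in a different order depending on which index is searched, which is the behavior the red hat example describes: the index determines what "similar" means.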

Getting started with Similarity Search

Over the next few months, we plan on releasing a set of public collections that any user can explore. To start, we’ve made the Open Images Dataset available for querying. Check it out to see exactly what (and who) is in one of the computer vision world’s most important datasets.

Alternatively, you can create a collection of your own from a public S3 bucket. For Matroid streams users, when you create a new monitoring, we’ll automatically create a collection from the monitoring detections. By default, all collections will be indexed with our all-purpose General Detector and our generic Face Indexer, but you’ll be able to add indexes with any detectors.

If you’re interested in connecting a collection from another source, please contact us at [email protected] for a consultation.

Enjoy exploring large visual datasets in a new way!
