Inside Visual Search Engine - By Superfish
The information retrieval approach to text analysis has proved extremely effective, with applications ranging from simple text retrieval by Google through speech recognition by Siri to answering tricky Jeopardy questions by IBM. Visual search has become not only another way to browse the images around us, but also the answer to many visual recognition tasks, such as classifying objects into categories, annotation, etc.
Superfish's visual search technology is at the heart of such an engine. We will share our experience in building it.
Inside Visual Search Engine. Part II - Scalable platform
Here are some numbers that show what the Superfish platform is capable of:
- 200M searches a day
- 3000 searches a second at peak time
- Over 1B third-party API calls a day
- 400ms average search latency
- 500+ servers
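To put these figures in proportion, here is a small back-of-the-envelope calculation. It uses only the numbers listed above. The one assumption is that the 400ms average latency also holds at peak load, which lets us apply Little's law (requests in flight = arrival rate x time in system) to estimate how many searches are being processed at once.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures taken from the list above.
searches_per_day = 200_000_000
peak_searches_per_second = 3000
average_latency_ms = 400


def average_rate(per_day):
    """Average number of searches per second over a whole day."""
    return per_day / SECONDS_PER_DAY


def in_flight(rate_per_second, latency_ms):
    """Little's law: concurrent requests = arrival rate * time in system."""
    return rate_per_second * latency_ms / 1000


avg_rate = average_rate(searches_per_day)            # roughly 2315 searches/s
peak_to_average = peak_searches_per_second / avg_rate  # about 1.3
# Assumption: average latency at peak equals the overall 400ms average.
concurrent_at_peak = in_flight(peak_searches_per_second, average_latency_ms)

print(f"average rate: {avg_rate:.0f} searches/s")
print(f"peak / average: {peak_to_average:.2f}")
print(f"searches in flight at peak: {concurrent_at_peak:.0f}")
```

Under that assumption, roughly 1200 searches are in flight at any moment during peak time, and the peak rate is only about 30% above the daily average.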
Much like any other search engine, our visual search engine consists of two major building blocks:
- Indexing is an offline process that indexes tens of millions of images a day across hundreds of machines, producing visual indexes with TBs of data.
- Search is a near-real-time process that looks up images similar to the query image using the visual indexes. We handle hundreds of millions of searches a day on hundreds of machines while maintaining low search latency. During the meetup we'll demonstrate our approach to indexing based on grid computing, and our home-made search services grid.
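The split between offline indexing and near-real-time search can be illustrated with a toy sketch. This is not Superfish's implementation: the feature vectors, the round-robin sharding, the cosine-similarity ranking, and all function names below are assumptions chosen for illustration. Each shard stands in for one machine's slice of the index, and a query is answered by asking every shard for its best matches and merging the results.

```python
import heapq
import math


def normalize(vector):
    """Scale a feature vector to unit length so a dot product is a cosine."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("feature vector must not be all zeros")
    return [x / norm for x in vector]


def build_index(images, num_shards):
    """Offline step: spread (image_id, feature_vector) pairs over shards."""
    shards = [[] for _ in range(num_shards)]
    for position, (image_id, vector) in enumerate(images):
        shards[position % num_shards].append((image_id, normalize(vector)))
    return shards


def search_shard(shard, query, k):
    """Return the k best (similarity, image_id) pairs from one shard."""
    scored = (
        (sum(a * b for a, b in zip(query, vector)), image_id)
        for image_id, vector in shard
    )
    return heapq.nlargest(k, scored)


def search(shards, query_vector, k):
    """Online step: query every shard, then merge the partial results."""
    query = normalize(query_vector)
    partial = []
    for shard in shards:
        partial.extend(search_shard(shard, query, k))
    return [image_id for _, image_id in heapq.nlargest(k, partial)]


# Hypothetical three-dimensional features; real visual features are far larger.
images = [
    ("red_shoe", [0.9, 0.1, 0.0]),
    ("blue_lamp", [0.0, 0.1, 0.9]),
    ("red_boot", [0.8, 0.2, 0.1]),
    ("green_chair", [0.1, 0.9, 0.1]),
]
shards = build_index(images, num_shards=2)
print(search(shards, [1.0, 0.1, 0.0], k=2))
```

In this sketch the shards are searched in a loop; in a distributed setting each shard would be queried in parallel, so latency depends on the slowest shard rather than on the total index size.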
Nir Antebi, First Line manager and Grid Program manager at Intel
My presentation shows how Intel’s cloud computing helps Intel deliver better chips in less time, using fewer resources and less money. We will look at the cloud architecture and the benefits of cloud computing for Intel’s chip designers. Users don’t need to search for a resource to run their application: if the application has unique requirements, the user can simply ask for a suitable machine and, assuming he has access to it, he will get it. The cloud provides a standard (internal standard) way to access resources, and the user can run any type of application in the cloud. It also enables resources to be shared, which is very important for expensive resources. For example, if a number of users in different projects each need a 1024GB machine for an hour a day (on average), a single such machine can be shared among the projects instead of each project having to buy its own. Running jobs in a cross-site environment gives each user access to many more resources than he would otherwise have.
One important point that I want to stress is that access to computing resources is transparent to the user. There are all these data centers at the back end, but all he sees is a common front end. He does not have to worry about where the resources are or how to access them. He just runs his application, secure in the knowledge that it will be executed.
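The shared 1024GB machine example can be made concrete with a little arithmetic. The sketch below is illustrative only: the project counts are made-up inputs, and it assumes the projects' usage can be scheduled so that it does not overlap, which is the best case for sharing.

```python
import math

HOURS_PER_DAY = 24


def machines_needed(projects, hours_per_project_per_day, shared):
    """Machines required with or without sharing.

    Assumes demand can be scheduled back to back with no overlap,
    so a shared machine can be kept busy for up to 24 hours a day.
    """
    if not shared:
        return projects  # every project buys its own machine
    total_hours = projects * hours_per_project_per_day
    return max(1, math.ceil(total_hours / HOURS_PER_DAY))


def utilization(projects, hours_per_project_per_day, machines):
    """Fraction of the available machine hours that are actually used."""
    return projects * hours_per_project_per_day / (machines * HOURS_PER_DAY)


# Hypothetical case: 8 projects, each needing the big machine 1 hour a day.
projects, hours = 8, 1
dedicated = machines_needed(projects, hours, shared=False)
pooled = machines_needed(projects, hours, shared=True)

print(f"dedicated: {dedicated} machines, "
      f"{utilization(projects, hours, dedicated):.1%} utilized")
print(f"shared:    {pooled} machine, "
      f"{utilization(projects, hours, pooled):.1%} utilized")
```

With these hypothetical inputs, eight dedicated machines would each sit idle for 23 hours a day, while one shared machine covers the same demand at about one third utilization.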
Amir Di-Nur, VP R&D, Superfish
I will present Superfish and its technology.
Dima Frid, Platform group lead, Superfish
At the heart of our real-time search system lies Mist, a mechanism consisting of two services:
- Orchestrator, which is a self-healing, linearly auto-scaling management system for our computational services.
- Lease, which is a fault-tolerant, highly available (HA) system that provisions the computational services to the search engine.
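The abstract does not describe how Lease works internally. As a rough illustration of the general idea behind lease-based provisioning, here is a generic lease-with-expiry pattern. Everything in it is an assumption: the class name `LeaseTable`, the worker names, the time-to-live value, and the design itself are hypothetical and are not Mist's actual API. The point of the pattern is fault tolerance: if a client crashes and stops renewing its lease, the worker becomes available again once the lease expires.

```python
import time


class LeaseTable:
    """Hypothetical lease table: hands out workers for a limited time."""

    def __init__(self, workers, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the example is deterministic
        self._expiry = {worker: None for worker in workers}  # None = free

    def _is_free(self, worker, now):
        expiry = self._expiry[worker]
        return expiry is None or expiry <= now

    def acquire(self):
        """Lease the first free or expired worker, or return None."""
        now = self._clock()
        for worker in self._expiry:
            if self._is_free(worker, now):
                self._expiry[worker] = now + self._ttl
                return worker
        return None

    def renew(self, worker):
        """Extend a live lease; returns False if it has already expired."""
        now = self._clock()
        if self._is_free(worker, now):
            return False
        self._expiry[worker] = now + self._ttl
        return True

    def release(self, worker):
        """Hand the worker back before the lease expires."""
        self._expiry[worker] = None


# A fake clock makes the expiry behaviour easy to follow.
now = [0.0]
table = LeaseTable(["search-1", "search-2"], ttl_seconds=5, clock=lambda: now[0])
first = table.acquire()    # leases "search-1"
second = table.acquire()   # leases "search-2"
third = table.acquire()    # None: both workers are leased
now[0] = 6.0               # both leases have expired without renewal
reclaimed = table.acquire()
print(first, second, third, reclaimed)
```

A production system would also need to replicate this table for high availability and to coordinate with whatever starts and stops the workers; both are outside the scope of this sketch.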