
Increasing The Pace Of Annotations – AI With A Human In The Loop

Hosted by Guy Kolb

Details

Dataset design has a huge impact on model quality, yet obtaining enough high-quality labeled data is difficult, time-consuming, and often expensive. When it comes to manual labeling, it’s essential to put in the effort where it counts. In this talk, we present a system we developed at Gong for interactive labeling and simultaneous training of text classifiers. To make every label count, we combine unsupervised retrieval with active learning. Active learning methods aim to reduce sample complexity by choosing which samples to label. Sentence embeddings support retrieval based on semantic similarity, which ‘increases the signal-to-noise ratio’: it yields a pool of samples that are likely to belong to the positive class. Finally, efficient sampling methods boost diversity in the dataset. The system is used internally to build our text classifiers. It is also efficient enough in low-resource settings that our users, who are typically not data-savvy, can build their own classifiers. The framework can be applied in other domains, such as the medical domain.
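
The three ingredients named in the abstract (embedding-based retrieval, active learning, and diversity-aware sampling) can be sketched roughly as follows. This is an illustrative sketch only, not the system presented in the talk: the function names, the centroid-based retrieval, the margin-style uncertainty score, and the greedy farthest-point selection are all assumptions chosen for brevity, and the embeddings and classifier probabilities are taken as given NumPy arrays.

```python
import numpy as np


def retrieve_pool(embeddings, seed_idx, pool_size):
    """Indices of the pool_size samples most cosine-similar to the seed centroid.

    Hypothetical helper: seed_idx lists a few known positive examples.
    """
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    centroid = unit[seed_idx].mean(axis=0)
    scores = unit @ centroid
    scores[seed_idx] = -np.inf  # never retrieve the seeds themselves
    return np.argsort(-scores)[:pool_size]


def uncertainty(probs):
    """1.0 when the positive-class probability is 0.5, 0.0 when it is 0 or 1."""
    return 1.0 - 2.0 * np.abs(probs - 0.5)


def select_batch(embeddings, pool_idx, probs, batch_size, shortlist_factor=3):
    """Shortlist the most uncertain pool samples, then pick a spread-out batch.

    Diversity is approximated by greedy farthest-point selection in
    embedding space (an assumption, not the method from the talk).
    """
    unc = uncertainty(probs[pool_idx])
    k = min(len(pool_idx), batch_size * shortlist_factor)
    shortlist = pool_idx[np.argsort(-unc)[:k]]
    vecs = embeddings[shortlist]

    chosen = [0]  # start from the single most uncertain sample
    min_dist = np.linalg.norm(vecs - vecs[0], axis=1)
    min_dist[0] = -np.inf  # mark as already chosen
    while len(chosen) < min(batch_size, k):
        nxt = int(np.argmax(min_dist))  # farthest from everything chosen so far
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(vecs - vecs[nxt], axis=1))
        min_dist[nxt] = -np.inf
    return shortlist[chosen]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(500, 32))   # stand-in for sentence embeddings
    probs = rng.uniform(size=500)      # stand-in for classifier outputs
    pool = retrieve_pool(emb, seed_idx=[0, 1, 2], pool_size=100)
    print(select_batch(emb, pool, probs, batch_size=10))
```

In an interactive loop, the selected batch would be shown to the annotator, the classifier retrained on the new labels, and the probabilities recomputed before the next round.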

Nucleai Academy: Medical AI Meetup
Ein Zeitim St 4 · Tel Aviv-Yafo