

Anomaly detection aims to identify patterns in data that differ significantly from what is expected. The problem is inherently one of binary classification: given a sufficiently large sample from the in-distribution (the training set), classify examples as either in-distribution or out-of-distribution (OOD). A natural approach to such OOD detection problems is to learn a density model from the training data and compute the likelihood ratio of OOD examples to in-distribution examples. In practice, however, this approach frequently fails for high-dimensional data.

In our new work we present SemSAD, a simple and generic framework for detecting examples that lie out-of-distribution for a given training set. The approach is based on learning a semantic similarity measure that finds, for a given test example, the semantically closest example in the training set; a discriminator then decides whether the two examples are sufficiently dissimilar semantically for the test example to be rejected as OOD. We outperform previous approaches to anomaly, novelty, and out-of-distribution detection in the visual domain by a large margin. In particular, we obtain AUROC values close to one on the challenging task of detecting examples from CIFAR-10 as out-of-distribution given CIFAR-100 as in-distribution, without making use of label information.
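The detection rule described above (embed the test example, retrieve its semantically closest training example, and score the pair with a discriminator) can be sketched in a few lines. The sketch below is illustrative only: in SemSAD both the encoder and the discriminator are learned networks, whereas here a unit-norm map stands in for the encoder and a cosine-distance score stands in for the discriminator, so the pipeline runs end to end on toy data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: in SemSAD the semantic encoder h and the
# discriminator are trained networks; here we use a unit-norm map and
# a cosine-distance score purely for illustration.
def encode(x):
    return x / np.linalg.norm(x)

def discriminator_score(z_test, z_nn):
    # Higher score = more semantically dissimilar from the retrieved
    # nearest neighbour (stand-in for the learned discriminator).
    return 1.0 - float(z_test @ z_nn)

def ood_score(x_test, train_emb):
    z = encode(x_test)
    sims = train_emb @ z                   # similarity to every training example
    nn = train_emb[np.argmax(sims)]        # semantically closest training example
    return discriminator_score(z, nn)      # reject as OOD above a chosen threshold

# Toy in-distribution: samples scattered around a fixed direction mu.
mu = rng.normal(size=16)
train = mu + 0.3 * rng.normal(size=(500, 16))
train_emb = np.stack([encode(x) for x in train])

s_in = ood_score(mu + 0.3 * rng.normal(size=16), train_emb)   # in-distribution test point
s_out = ood_score(rng.normal(size=16), train_emb)             # OOD test point
```

With a well-trained semantic encoder, in-distribution test examples find a close semantic match in the training set (low score), while OOD examples do not, which is what drives the AUROC numbers reported above.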