Locality sensitive hashing in fourier frequency domain for soft set containment search

Show simple item record

dc.contributor.author Roy, Indradyumna
dc.contributor.author Agarwal, Rishi
dc.contributor.author Chakrabarti, Soumen
dc.contributor.author Dasgupta, Anirban
dc.contributor.author De, Abir
dc.contributor.other 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
dc.coverage.spatial United States of America
dc.date.accessioned 2023-11-08T15:16:15Z
dc.date.available 2023-11-08T15:16:15Z
dc.date.issued 2023-12-10
dc.identifier.citation Roy, Indradyumna; Agarwal, Rishi; Chakrabarti, Soumen; Dasgupta, Anirban and De, Abir, "Locality sensitive hashing in fourier frequency domain for soft set containment search", in the 37th Conference on Neural Information Processing Systems (NeurIPS 2023), New Orleans, US, Dec. 10-16, 2023.
dc.identifier.uri https://repository.iitgn.ac.in/handle/123456789/9413
dc.description.abstract In many search applications related to passage retrieval, text entailment, and subgraph search, the query and each 'document' is a set of elements, with a document being relevant if it contains the query. These elements are not represented by atomic IDs, but by embedded representations, thereby extending set containment to soft set containment. Recent applications address soft set containment by encoding sets into fixed-size vectors and checking for elementwise vector dominance. This 0/1 property can be relaxed to an asymmetric hinge distance for scoring and ranking candidate documents. Here we focus on data-sensitive, trainable indices for fast retrieval of relevant documents. Existing LSH methods are designed for mostly symmetric or few simple asymmetric distance functions, which are not suitable for hinge distance. Instead, we transform hinge distance into a proposed dominance similarity measure, to which we then apply a Fourier transform, thereby expressing dominance similarity as an expectation of inner products of functions in the frequency domain. Next, we approximate the expectation with an importance-sampled estimate. The overall consequence is that now we can use a traditional LSH, but in the frequency domain. To ensure that the LSH uses hash bits efficiently, we learn hash functions that are sensitive to both corpus and query distributions, mapped to the frequency domain. Our experiments show that the proposed asymmetric dominance similarity is critical to the targeted applications, and that our LSH, which we call FourierHashNet, provides a better query time vs. retrieval quality trade-off, compared to several baselines. Both the Fourier transform and the trainable hash codes contribute to performance gains.
dc.description.statementofresponsibility by Indradyumna Roy, Rishi Agarwal, Soumen Chakrabarti, Anirban Dasgupta and Abir De
dc.title Locality sensitive hashing in fourier frequency domain for soft set containment search
dc.type Conference Paper


Files in this item

Files Size Format View

There are no files associated with this item.

This item appears in the following Collection(s)

Show simple item record

Search Digital Repository


Browse

My Account