ACM Transactions on Information Systems (TOIS), Volume 31 Issue 2, May 2013

Modeling reformulation using query distributions
Xiaobing Xue, W. Bruce Croft
Article No.: 6
DOI: 10.1145/2457465.2457466

Query reformulation modifies the original query with the aim of better matching the vocabulary of the relevant documents, and consequently improving ranking effectiveness. Previous models typically generate words and phrases related to the...

Transfer joint embedding for cross-domain named entity recognition
Sinno Jialin Pan, Zhiqiang Toh, Jian Su
Article No.: 7
DOI: 10.1145/2457465.2457467

Named Entity Recognition (NER) is a fundamental task in information extraction from unstructured text. Most previous machine-learning-based NER systems are domain-specific, which implies that they may only perform well on some specific domains...

Studying the clustering paradox and scalability of search in highly distributed environments
Weimao Ke, Javed Mostafa
Article No.: 8
DOI: 10.1145/2457465.2457468

With the ubiquitous production, distribution and consumption of information, today's digital environments such as the Web are increasingly large and decentralized. It is hardly possible to obtain central control over information collections and...

Sparse hashing for fast multimedia search
Xiaofeng Zhu, Zi Huang, Hong Cheng, Jiangtao Cui, Heng Tao Shen
Article No.: 9
DOI: 10.1145/2457465.2457469

Hash-based methods achieve fast similarity search by representing high-dimensional data with compact binary codes. However, both generating binary codes and encoding unseen data effectively and efficiently remain very challenging tasks. In this...

Efficient fuzzy search in large text collections
Hannah Bast, Marjan Celikik
Article No.: 10
DOI: 10.1145/2457465.2457470

We consider the problem of fuzzy full-text search in large text collections, that is, full-text search which is robust against errors on both the query side and the document side. Standard inverted-index techniques work...