Rank fusion is a powerful technique that allows multiple sources of information to be combined into a single result set. However, to date, fusion has not been regarded as cost-effective in cases where strict per-query efficiency guarantees are required, such as in web search. In this work we propose a novel solution to rank fusion by splitting the computation into two parts: one phase that is carried out offline to generate pre-computed centroid answers for queries with broadly similar information needs, and a second, online phase that uses the corresponding topic centroid to compute a result page for each query. We explore efficiency improvements to classic fusion algorithms whose costs can be amortized as a pre-processing step, and which can then be combined with re-ranking approaches to dramatically improve effectiveness in multi-stage retrieval systems with little efficiency overhead at query time. Experimental results using the ClueWeb12B collection and the UQV100 query variations demonstrate that centroid-based approaches allow improved retrieval effectiveness at little or no loss in query throughput or latency, and with reasonable pre-processing requirements. We additionally show that queries that do not match any of the pre-computed clusters can be accurately identified and efficiently processed in our proposed ranking pipeline.
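The two-phase idea above can be sketched in a few lines. The sketch below is a minimal illustration, not the paper's implementation: it assumes reciprocal rank fusion (RRF) as the classic fusion algorithm run offline over each cluster of query variations, and a caller-supplied `match_topic` function (e.g. nearest-neighbour matching over query representations) for the online phase.

```python
from collections import defaultdict

def rrf_fuse(runs, k=60):
    """Reciprocal rank fusion over several ranked lists.

    Each run is a list of document ids in rank order; the result is a
    single fused ranking. RRF is one plausible choice of classic
    fusion algorithm whose cost can be paid offline."""
    scores = defaultdict(float)
    for run in runs:
        for rank, doc in enumerate(run, start=1):
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

def build_centroids(clusters):
    """Offline phase: fuse the runs of each cluster of query
    variations into a pre-computed 'centroid' answer.
    clusters maps a topic id to a list of ranked runs."""
    return {topic: rrf_fuse(runs) for topic, runs in clusters.items()}

def answer(query, centroids, match_topic):
    """Online phase: map the incoming query to a topic cluster and
    serve its centroid list (which could then be re-ranked).
    A query matching no cluster falls back to the normal pipeline,
    signalled here by returning None."""
    topic = match_topic(query)
    if topic is None:
        return None
    return centroids[topic]
```

At query time only the cheap `match_topic` lookup and (optionally) a re-ranking pass remain, which is why the fusion cost no longer threatens per-query latency guarantees.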
We propose a Joint Neural Collaborative Filtering (J-NCF) method for recommender systems. The J-NCF model applies a joint neural network that couples deep feature learning and deep interaction modeling with a rating matrix. Deep feature learning extracts feature representations of users and items with a deep learning architecture based on a user-item rating matrix. Deep interaction modeling captures non-linear user-item interactions with a deep neural network, using the feature representations generated by the deep feature learning process as input. J-NCF enables the deep feature learning and deep interaction modeling processes to optimize each other through joint training, which can improve recommendation performance. Additionally, we design a new loss function for optimization, which takes both implicit and explicit feedback into account and combines point-wise and pair-wise loss. Experiments on several datasets show significant improvements of J-NCF over state-of-the-art methods, with improvements of 8.24% on the MovieLens 100K dataset, 10.81% on the MovieLens 1M dataset, and 10.21% on the Amazon Movies dataset in terms of HR@10. NDCG@10 improvements are 12.42%, 14.24%, and 15.06%, respectively. We also conduct experiments to evaluate the scalability and sensitivity of J-NCF. These experiments show that J-NCF achieves competitive performance on sparse datasets and for inactive users compared with state-of-the-art baselines.
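A loss of the kind described above can be illustrated per training pair as follows. This is a hedged sketch, not the paper's actual formulation: the rating-based weight, the BPR-style pair-wise term, and the balancing parameter `alpha` are all illustrative assumptions about how explicit feedback, point-wise loss, and pair-wise loss might be combined in one objective.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hybrid_loss(s_pos, s_neg, rating, r_max, alpha=0.5):
    """Illustrative hybrid loss in the spirit of J-NCF.

    s_pos, s_neg: predicted scores in (0, 1) for an observed item and
    a sampled negative item. rating/r_max folds explicit feedback in
    as a weight on the point-wise term; the pair-wise term pushes the
    observed item's score above the negative's (BPR-style). alpha
    balances the two terms. All of these choices are assumptions for
    illustration, not the paper's exact definition."""
    w = rating / r_max
    pointwise = -w * math.log(s_pos)            # implicit target = 1
    pairwise = -math.log(sigmoid(s_pos - s_neg))
    return alpha * pointwise + (1.0 - alpha) * pairwise
```

A sanity check of the intended behaviour: the loss shrinks as the observed item is scored higher and further above the sampled negative.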
Question answering over knowledge bases (KB-QA) aims to take full advantage of the knowledge in knowledge bases, with the ultimate purpose of returning answers to questions. In accessing the substantial knowledge within a KB, many model architectures are hindered by the bottleneck of accurately predicting the relations that connect subject entities to object entities. To break this bottleneck, this paper presents a novel framework that can be viewed as an extension of APVA-TURBO. Experimental results show a boost in performance over the APVA-TURBO approach, and the framework outperforms other question answering approaches.
Intensive recent research has investigated the combined use of hand-curated knowledge sources and corpus-driven sources to learn effective text representations. The overall learning process can be run online, by revising the learning objective, or offline, by refining an originally learned representation. The differentiated impact of each learning approach on the quality of the learned representations has not so far been studied in the literature. This article focuses on the design of comparable offline vs. online knowledge-enhanced document representation learning models and on the comparison of their effectiveness using a set of standard IR and NLP downstream tasks. The results of quantitative and qualitative analyses show that 1) offline and online learning approaches exhibit dissimilar result trends depending on the task as well as on the dataset distribution with respect to the application domain; and 2) while considering relational semantics is undoubtedly beneficial, the way relational constraints are expressed can affect the effectiveness of semantic inference. The findings of this work present opportunities for the design of future representation learning models, and also provide insights into the evaluation of such models.