Experimental factors that affect IR performance measures are often overlooked when comparing IR systems. In particular, the effects of splitting the document collection into shards have not been examined in detail. We use the general linear mixed model framework and present a model that encompasses the experimental factors of system, topic, shard, and their interaction effects. This detailed model allows us to estimate differences between the effects of the various factors more accurately. We study shards created by a range of methods used in prior work, explain observations from that work in a principled setting, and offer new insights. Notably, we find that the topic*shard interaction effect is large across nearly all datasets, an observation that, to our knowledge, has not been recognized or measured before.
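As an illustration only, a model with main effects for system, topic, and shard plus their two-way interactions could be written as below. The symbols are assumptions introduced here, not notation from the abstract: y_{ijk} is the score of system i on topic j over shard k, \mu is the grand mean, and \varepsilon_{ijk} is the residual error. The abstract does not say which factors are treated as fixed and which as random, so the sketch leaves that open.

```latex
y_{ijk} = \mu + \alpha_i + \tau_j + \beta_k
        + (\alpha\tau)_{ij} + (\alpha\beta)_{ik} + (\tau\beta)_{jk}
        + \varepsilon_{ijk}
```

Here \alpha_i, \tau_j, and \beta_k are the system, topic, and shard effects, and (\tau\beta)_{jk} is the topic*shard interaction. With a single score per (system, topic, shard) cell, a three-way interaction term cannot be separated from the error term, which is why it is omitted from this sketch.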
Existing Web image search engines take diversity into consideration when ranking search results. However, there has been no thorough investigation of how diversity affects user satisfaction in image search. In this paper, we address the following questions: (1) How do different factors, such as content and visual presentation, affect users' perception of diversity? (2) How does search result diversity affect user satisfaction under different search intents? To answer these questions, we conduct a set of laboratory user studies to collect users' perceived diversity annotations and search satisfaction. We find that the presence of near-duplicate image results has the largest impact on users' perceived diversity, followed by similarity in content and visual presentation. Beyond these findings, we also investigate the relationship between diversity and satisfaction in image search. Specifically, we find that users' preference for diversity varies across search intents. When users want to collect information or save images for further use (the Locate search tasks), more diversified result lists lead to higher satisfaction levels. These insights may help commercial image search engines design better result ranking strategies and evaluation metrics.
Dialogue management (DM) decides the next action of a dialogue system according to the current dialogue state, and thus plays a central role in task-oriented dialogue systems. Since dialogue management requires access not only to local utterances but also to the global semantics of the entire dialogue session, modeling long-range history information is a critical issue. To this end, we propose a novel Memory-Augmented Dialogue management model (MAD), which employs a memory controller and two additional memory structures: a slot-value memory and an external memory. The slot-value memory tracks the dialogue state by memorizing and updating the values of semantic slots (for instance, cuisine, price, and location), and the external memory augments the hidden-state representation of traditional recurrent neural networks by storing more context information. To update the dialogue state efficiently, we also propose slot-level attention on user utterances to extract specific semantic information for each slot. Experiments show that our model achieves state-of-the-art performance and outperforms existing baselines.
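The idea of slot-level attention can be illustrated with a minimal sketch. It is not the MAD implementation: the abstract does not specify the scoring function, so dot-product scoring, the toy 2-dimensional token embeddings, the per-slot query vectors, and the names `softmax` and `slot_attention` are all assumptions made for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def slot_attention(slot_query, token_vectors):
    """Attend over utterance tokens with one slot's query vector.

    Returns the attention weights and the slot-specific weighted sum
    of the token vectors. Dot-product scoring is an assumption.
    """
    scores = [sum(q * t for q, t in zip(slot_query, vec)) for vec in token_vectors]
    weights = softmax(scores)
    dim = len(token_vectors[0])
    summary = [sum(w * vec[d] for w, vec in zip(weights, token_vectors))
               for d in range(dim)]
    return weights, summary

# Hypothetical utterance with hand-picked toy embeddings.
tokens = ["cheap", "italian", "food"]
token_vectors = [[0.0, 2.0], [2.0, 0.0], [0.5, 0.5]]

# One hypothetical query vector per semantic slot.
slot_queries = {"cuisine": [1.0, 0.0], "price": [0.0, 1.0]}

attended = {slot: slot_attention(query, token_vectors)
            for slot, query in slot_queries.items()}

for slot, (weights, _) in attended.items():
    best = tokens[weights.index(max(weights))]
    print(slot, "attends most to", best)
```

Each slot gets its own attention distribution over the same utterance, so different slots can pick out different words. In a trained model the queries and embeddings would be learned rather than set by hand.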
Item-based Collaborative Filtering (ICF) has been widely adopted in recommender systems. By constructing a user's profile from the items that the user has consumed, ICF recommends items that are similar to the user's profile. With the prevalence of machine learning, significant progress has been made for ICF by learning item similarity from data. Nevertheless, we argue that most existing works have only considered linear and shallow relationships between items, which are insufficient to capture the complicated decision-making process of users. In this work, we propose a more expressive ICF solution that accounts for the nonlinear and higher-order relationships among items. Going beyond modeling only the second-order interaction (e.g., similarity) between two items, we additionally consider the interaction among all interacted item pairs by using nonlinear neural networks. In this way, we can effectively model the higher-order relationships among items, capturing more complicated effects in user decision-making. We treat this solution as a deep variant of ICF and thus term it DeepICF. Extensive experiments verify the highly positive effect of higher-order item interaction modeling with nonlinear neural networks. Moreover, we demonstrate that more fine-grained second-order interaction modeling with an attention network can further improve the performance of our DeepICF method.
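For orientation, the classic linear ICF scoring that DeepICF extends can be sketched as follows. This is the baseline idea only, not DeepICF itself: cosine similarity over a binary interaction matrix is an assumed choice of similarity, and the function names and toy data are hypothetical.

```python
import math

def item_cosine_similarity(interactions):
    """Cosine similarity between item columns of a user-by-item 0/1 matrix."""
    n_items = len(interactions[0])
    columns = [[row[i] for row in interactions] for i in range(n_items)]
    norms = [math.sqrt(sum(v * v for v in col)) for col in columns]
    sim = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for j in range(n_items):
            # Skip self-similarity and items that nobody has consumed.
            if i != j and norms[i] > 0 and norms[j] > 0:
                dot = sum(a * b for a, b in zip(columns[i], columns[j]))
                sim[i][j] = dot / (norms[i] * norms[j])
    return sim

def icf_scores(profile, sim):
    """Score each unconsumed item by summing its similarity to consumed items."""
    consumed = [j for j, v in enumerate(profile) if v > 0]
    return {i: sum(sim[i][j] for j in consumed)
            for i, v in enumerate(profile) if v == 0}

# Toy data: 3 users (rows) by 4 items (columns).
interactions = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]

sim = item_cosine_similarity(interactions)
scores = icf_scores(interactions[0], sim)
top_item = max(scores, key=scores.get)
print("recommend item", top_item, "to user 0")
```

The score here is a plain sum of pairwise (second-order) similarities. The abstract's proposal replaces this linear aggregation with nonlinear neural layers over item-pair interactions, which this sketch does not attempt to reproduce.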
Image search engines differ significantly from general Web search engines in the way they present search results. This difference leads to different interaction and examination behavior patterns, and therefore requires changes in evaluation methodologies. However, the evaluation of image search still relies on methods designed for general Web search. In particular, offline metrics are calculated from coarse-grained topical relevance judgments under the assumption that users examine results sequentially. In this paper, we investigate crowdsourced annotation methods for image search evaluation based on a lab-based user study. Using user satisfaction as the gold standard, we make a number of findings. (1) Compared with item-based annotation, annotating relevance in a row-based way is more efficient without hurting performance. (2) Besides topical relevance, image quality plays a crucial role in evaluating image search results, and the importance of image quality changes with search intent. (3) The fine-grained annotation method performs significantly better than traditional 4-level scales. To the best of our knowledge, our work is the first to systematically study how diverse factors in data annotation affect image search evaluation. Our results suggest different strategies for using crowdsourcing to obtain annotated data under different conditions.
In recent years, many studies have extracted aspects from user reviews and integrated them with ratings to improve recommendation performance. The common aspects mentioned in a user's reviews and a product's reviews indicate indirect connections between the user and the product. However, these aspect-based methods suffer from two problems. First, the common aspects are usually very sparse, which is caused by the sparsity of user-product interactions and the diversity of individual users' vocabularies. Second, a user's interests in aspects can differ from product to product, whereas existing methods usually assume them to be static. In this paper, we propose an Attentive Aspect-based Recommendation Model (AARM) to tackle these challenges. For the first problem, to enrich the aspect connections between user and product, AARM models the interactions between synonymous and similar aspects in addition to common aspects. For the second problem, we construct a neural attention network that simultaneously considers user, product, and aspect information to capture a user's attention towards aspects when examining different products. Extensive quantitative and qualitative experiments show that AARM effectively alleviates the two aforementioned problems and significantly outperforms several state-of-the-art recommendation methods on the top-N recommendation task.