Comparing the Archival Rate of Arabic, English, Danish, and Korean Language Web Pages
Lulwah M. Alkwai, Michael L. Nelson, Michele C. Weigle
Article No.: 1
It has long been suspected that web archives and search engines favor Western and English language webpages. In this article, we quantitatively explore how well Arabic language webpages are indexed and archived compared to those from other...
State-of-the-art encoders for inverted indexes compress each posting list individually. Encoding clusters of posting lists offers the possibility of reducing the redundancy of the lists while maintaining a noticeable query processing...
The exponential increase of online videos greatly enriches the life of users but also brings huge numbers of near-duplicate videos (NDVs) that seriously challenge the video websites. The video websites entail NDV-related applications such as...
Event-based social networking services, such as Meetup, are capable of linking online virtual interactions to offline physical activities. Compared to mono online social networking services (e.g., Twitter and Google+), such dual networks provide a...
Inverse Document Frequency (IDF) is a widely accepted term weighting scheme whose robustness is supported by many theoretical justifications. However, applying IDF to word N-grams (or simply N-grams) of any length without relying on heuristics has...
Yum-Me: A Personalized Nutrient-Based Meal Recommender System
Longqi Yang, Cheng-Kang Hsieh, Hongjian Yang, John P. Pollak, Nicola Dell, Serge Belongie, Curtis Cole, Deborah Estrin
Article No.: 7
Nutrient-based meal recommendations have the potential to help individuals prevent or manage conditions such as diabetes and obesity. However, learning people’s food preferences and making recommendations that simultaneously appeal to their...
Search Result Diversification in Short Text Streams
Shangsong Liang, Emine Yilmaz, Hong Shen, Maarten De Rijke, W. Bruce Croft
Article No.: 8
We consider the problem of search result diversification for streams of short texts. Diversifying search results in short text streams is more challenging than in the case of long documents, as it is difficult to capture the latent topics of short...
Learning to Align Comments to News Topics
Lei Hou, Juanzi Li, Xiao-Li Li, Jie Tang, Xiaofei Guo
Article No.: 9
With the rapid proliferation of social media, increasingly more people express their opinions and reviews (user-generated content (UGC)) on recent news articles through various online services, such as news portals, forums, discussion groups, and...
Inferring Dynamic User Interests in Streams of Short Texts for User Clustering
Shangsong Liang, Zhaochun Ren, Yukun Zhao, Jun Ma, Emine Yilmaz, Maarten De Rijke
Article No.: 10
User clustering has been studied from different angles. In order to identify shared interests, behavior-based methods consider similar browsing or search patterns of users, whereas content-based methods use information from the contents of the...