Comparing the Archival Rate of Arabic, English, Danish, and Korean Language Web Pages
Lulwah M. Alkwai, Michael L. Nelson, Michele C. Weigle
Article No.: 1
It has long been suspected that web archives and search engines favor Western and English language webpages. In this article, we quantitatively explore how well Arabic language webpages are indexed and archived compared to those from other...
State-of-the-art encoders for inverted indexes compress each posting list individually. Encoding clusters of posting lists offers the possibility of reducing the redundancy of the lists while maintaining a noticeable query processing...
GVoS: A General System for Near-Duplicate Video-Related Applications on Storm
Jiawei Jiang, Yunhai Tong, Hua Lu, Bin Cui, Kai Lei, Lele Yu
Article No.: 3
The exponential increase of online videos greatly enriches the life of users but also brings huge numbers of near-duplicate videos (NDVs) that seriously challenge the video websites. The video websites entail NDV-related applications such as...
Event-based social networking services, such as Meetup, are capable of linking online virtual interactions to offline physical activities. Compared to mono online social networking services (e.g., Twitter and Google+), such dual networks provide a...
IDF for Word N-grams
Masumi Shirakawa, Takahiro Hara, Shojiro Nishio
Article No.: 5
Inverse Document Frequency (IDF) is a widely accepted term weighting scheme whose robustness is supported by many theoretical justifications. However, applying IDF to word N-grams (or simply N-grams) of any length without relying on heuristics has...