IWRandom Query Interfaces


Abstract

This dataset collects the query interfaces of 33 deep Web sources randomly sampled from Invisible-Web.net. Its purpose is to provide a diverse set of query interfaces from various domains. Because all deep Web sources in Invisible-Web.net are numbered sequentially by ID, we draw random samples by source ID and archive the query interface page of each sampled source. The samples cover 16 of the 18 top-level domains listed in Invisible-Web.net, including: Art and Architecture, Bibliographies, Business and Investing, Computers and Computing, Education, Entertainment, Government Info and Data, US and World History, Legal and Criminal Info, Search for People, Public Record, Reference, Science, Social Science, and Transportation.
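
The sampling step described above can be illustrated with a minimal sketch. It is not the authors' actual script: the function name, the 1-based ID range, sampling without replacement, and the directory size TOTAL_SOURCES are all assumptions made for illustration, since the abstract does not state them.

```python
import random


def sample_source_ids(total_sources, sample_size, seed=None):
    """Draw `sample_size` distinct source IDs uniformly from 1..total_sources.

    Hypothetical helper: assumes IDs are consecutive integers starting at 1.
    """
    rng = random.Random(seed)
    # random.sample draws without replacement, so no source is picked twice.
    return sorted(rng.sample(range(1, total_sources + 1), sample_size))


# Placeholder value: the real number of sources in the directory is not given here.
TOTAL_SOURCES = 1000

# 33 matches the number of sampled sources stated in the abstract.
ids = sample_source_ids(TOTAL_SOURCES, 33, seed=0)
print(ids)
```

Fixing the seed makes the draw repeatable, which is useful if the same set of interface pages has to be archived again later.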

Documentation

The document describes the creation and usage of this dataset.


Data files


Tasks Using This Dataset


Sources

Original Owners

Zhen Zhang, Bin He and Kevin Chen-Chuan Chang
Computer Science Department
University of Illinois at Urbana-Champaign
zhang2@uiuc.edu

Date Created: November 2003

