ICQ Query Interfaces


Abstract

This data set contains query interfaces to Web data sources in five domains: airfare, automobile, book, job, and real estate. For each domain, 20 query interfaces were collected using two online directories. First, we searched the sources listed in invisibleweb.com (now www.profusion.com), which maintains a directory of hidden sources and their corresponding search Web sites. We also used the Web directory maintained by yahoo.com. Since yahoo.com does not focus on listing hidden sources, we examined each source listed under a domain of interest to determine whether it is a hidden source and, if so, identified its search Web site. Once the search Web sites were collected, their query interfaces were manually transformed into schema trees.
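To illustrate what a schema tree for a query interface might look like, here is a minimal Python sketch. It assumes that leaf nodes stand for individual input fields and internal nodes for groups of related fields. The class name SchemaNode, the field labels, and the airfare example are all hypothetical and are not taken from the data files, whose actual format is not described here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchemaNode:
    """One node of a hypothetical schema tree for a query interface."""
    label: str
    children: List["SchemaNode"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # A node with no children is treated as a single input field.
        return not self.children

    def leaves(self) -> List[str]:
        """Return the labels of all leaf nodes, left to right."""
        if self.is_leaf():
            return [self.label]
        result: List[str] = []
        for child in self.children:
            result.extend(child.leaves())
        return result


# Hypothetical airfare interface: groups of fields form internal nodes.
airfare = SchemaNode("Airfare search", [
    SchemaNode("Route", [SchemaNode("From"), SchemaNode("To")]),
    SchemaNode("Dates", [SchemaNode("Departure date"), SchemaNode("Return date")]),
    SchemaNode("Passengers", [SchemaNode("Adults"), SchemaNode("Children")]),
])

# List the input fields of the interface in their on-page order.
print(airfare.leaves())
```

Keeping the grouping in internal nodes, rather than flattening the interface into a list of fields, preserves which fields belong together on the form.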

Documentation

The document describes the usage of this dataset.


Data files


Tasks Using This Dataset


Sources

Original Owners

Wensheng Wu (Computer Science Department, University of Illinois at Urbana-Champaign)
Clement Yu (Computer Science Department, University of Illinois at Chicago)
AnHai Doan (Computer Science Department, University of Illinois at Urbana-Champaign)
Weiyi Meng (Computer Science Department, SUNY at Binghamton)
wwu2@uiuc.edu

Date Created: Jan 2003

