ICQ Query Interfaces
Abstract
This data set contains query interfaces to data sources on the Web in five domains: airfare, automobile, book, job, and real estate. For each domain, 20 query interfaces were collected using two online directories. First, we searched the sources listed on invisibleweb.com (now www.profusion.com), which maintains a directory of hidden sources and their corresponding search Web sites. We also used the Web directory maintained by yahoo.com. Since yahoo.com does not focus on listing hidden sources, we examined each source listed for a domain of interest to determine whether it is a hidden source and, if so, identified its search Web site. After the search Web sites were collected, they were manually transformed into schema trees.
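As a rough illustration of what a schema tree for a query interface might look like, the sketch below models an interface as a tree whose leaves are individual fields and whose internal nodes group related fields. The class name `SchemaNode`, the helper `leaf_labels`, and the airfare labels are all hypothetical and chosen only for illustration; they are not taken from the data files, whose actual format is described in the dataset's documentation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchemaNode:
    """One node of a hypothetical schema tree (not the dataset's own format)."""
    label: str
    children: List["SchemaNode"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # A node without children stands for a single field on the interface.
        return not self.children


def leaf_labels(node: SchemaNode) -> List[str]:
    """Return the field labels of the tree in left-to-right order."""
    if node.is_leaf():
        return [node.label]
    labels: List[str] = []
    for child in node.children:
        labels.extend(leaf_labels(child))
    return labels


# Illustrative airfare interface; labels are invented for this example.
airfare = SchemaNode("airfare search", [
    SchemaNode("route", [SchemaNode("from"), SchemaNode("to")]),
    SchemaNode("dates", [SchemaNode("departure date"), SchemaNode("return date")]),
    SchemaNode("passengers"),
])

print(leaf_labels(airfare))
```

Keeping groups as internal nodes preserves the visual grouping of fields on the original form, which a flat list of field names would lose.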
Documentation
The document describes the usage of this dataset.
Data files
Tasks Using This Dataset
Sources
Original Owners
Wensheng Wu (Computer Science Department, University of Illinois at Urbana-Champaign)
Clement Yu (Computer Science Department, University of Illinois at Chicago)
AnHai Doan (Computer Science Department, University of Illinois at Urbana-Champaign)
Weiyi Meng (Computer Science Department, SUNY at Binghamton)
wwu2@uiuc.edu
Date Created: Jan 2003