ICQ Query Interfaces
This data set contains query interfaces to data sources on the Web in five domains: airfare, automobile, book, job, and real estate. For each domain, 20 query interfaces (100 in total) are collected using two online directories. First, we search the sources listed in invisibleweb.com (now www.profusion.com), which maintains a directory of hidden sources and their corresponding search Web sites. We also use the Web directory maintained by yahoo.com. Since yahoo.com does not focus on listing hidden sources, we examine each source it lists under a domain of interest to determine whether it is a hidden source and, if so, identify its search Web site. After the search Web sites are collected, they are manually transformed into schema trees.
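To make the notion of a schema tree more concrete, the following is a minimal sketch of how one query interface might be represented in memory. It is not the file format of this data set, which is not described here. The class name `SchemaNode`, the helper `build_example_airfare`, and all field labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SchemaNode:
    """One node of a schema tree (hypothetical structure, for illustration only)."""
    label: str
    children: List["SchemaNode"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # A leaf stands for a single input field on the query interface.
        return not self.children

    def leaves(self) -> List[str]:
        # Collect field labels in left-to-right order.
        if self.is_leaf():
            return [self.label]
        labels: List[str] = []
        for child in self.children:
            labels.extend(child.leaves())
        return labels

    def depth(self) -> int:
        # Number of levels in the subtree rooted at this node.
        if self.is_leaf():
            return 1
        return 1 + max(child.depth() for child in self.children)


def build_example_airfare() -> SchemaNode:
    # An invented airfare interface; labels are assumptions, not taken from the data set.
    return SchemaNode("Airfare search", [
        SchemaNode("Route", [SchemaNode("From"), SchemaNode("To")]),
        SchemaNode("Dates", [SchemaNode("Departure date"), SchemaNode("Return date")]),
        SchemaNode("Passengers"),
    ])


if __name__ == "__main__":
    tree = build_example_airfare()
    print(tree.leaves())
    print(tree.depth())
```

In this sketch, internal nodes group related fields and leaves correspond to individual fields, so the leaf order mirrors the order in which fields might appear on the interface.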
describes the usage of this dataset.
Tasks Using This Dataset
Wensheng Wu (Computer Science Department, University of Illinois at Urbana-Champaign)
Clement Yu (Computer Science Department, University of Illinois at Chicago)
AnHai Doan (Computer Science Department, University of Illinois at Urbana-Champaign)
Weiyi Meng (Computer Science Department, SUNY at Binghamton)
Date Created: Jan 2003