National Taiwan University of Science and Technology, Department of Computer Science and Information Engineering

Intelligent Systems Laboratory


Research Resources


Intelligent Systems on the Web (Data Mining)


Tutorial/Survey


Websites:
(1)Search engine tutorial for web designers:
http://www.northernwebs.com/set/summary.html
(2)Pandia Search World:
http://www.pandia.com/searchworld/
(3)Search Engine Watch:
http://searchenginewatch.com/
(4)Beginners' Central:
http://www.northernwebs.com/bc/

Papers:
(1)
"A Survey of Information Retrieval and Filtering Methods" Christos Faloutsos and Douglas Oard, Technical Report, Information Filtering Project, University of Maryland, College Park, MD, 1996. http://citeseer.nj.nec.com/faloutsos96survey.html
(2)
"WWW Robots and Search Engines" Heinonen, K. Hatonen, and K. Klemettinen, Seminar on Mobile Code, Report TKO-C79, Helsinki University of Technology, Department of Computer Science, 1996. http://citeseer.nj.nec.com/heinonen96www.html
(3)
"Information Retrieval on the Web" M. Kobayashi and K. Takeda, IBM Research Report RT0347, April 2000. http://citeseer.nj.nec.com/kobayashi00information.html
(4)
"A Modern Approach to Searching the World Wide Web: Ranking Pages by Inference over Content" http://citeseer.nj.nec.com/467021.html
(5)
"Searching the Web" A. Arasu, J. Cho and S. Raghavan, ACM Transactions on Internet Technology, Vol. 1, pp. 2-43, 2001
(6)
 "Local Searching the Internet" M. Angelaccio and B. Buttarazzi, IEEE Internet Computing, Vol. 6, pp. 25-33, 2002
(7)
"Ranking function optimization for effective web search by genetic programming: an empirical study" W. Fan, M. D. Gordon, P. Pathak, W. Xi and E. A. Fox, Proceedings of the 37th Hawaii International Conference on System Sciences (HICSS), 2004.

Search Engine
A search engine is a database system designed to index Internet addresses (URLs, Usenet, FTP, image locations, etc.). A typical search engine contains a special program, often called a spider. The spider accepts a URL, goes to that website, and retrieves a copy of the file found there. Some time later, the search engine processes that copy, distilling it down to the bare essential data it needs for its database. While most search engines request both a URL and an email address, the search engine itself determines what data ends up in the database. In short, given a URL, an automated process runs that results in your site being included in the index.
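
The two-stage process described above (the spider stores a copy of the page, and a later pass distills it into the index) can be sketched in Python as follows. This is only a minimal illustration, not how any particular search engine works: the class name TinySearchEngine, the fetch callable, and the choice of a simple word-to-URL inverted index are all assumptions made for the example.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def distill(html):
    """Reduce a stored page copy to a set of lowercase index terms."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return set(re.findall(r"[a-z0-9]+", " ".join(parser.parts).lower()))


class TinySearchEngine:
    """Hypothetical toy engine: a spider step plus a deferred indexing step."""

    def __init__(self, fetch):
        self.fetch = fetch              # assumed callable: url -> HTML string
        self.pending = {}               # url -> raw copy awaiting processing
        self.index = defaultdict(set)   # term -> set of urls containing it

    def submit(self, url):
        """Spider step: retrieve and store a copy of the file at the URL."""
        self.pending[url] = self.fetch(url)

    def process_pending(self):
        """Later step: distill every stored copy into the index."""
        for url, html in self.pending.items():
            for term in distill(html):
                self.index[term].add(url)
        self.pending.clear()

    def search(self, query):
        """Return a sorted list of URLs that contain every query term."""
        terms = re.findall(r"[a-z0-9]+", query.lower())
        if not terms:
            return []
        results = set(self.index.get(terms[0], set()))
        for term in terms[1:]:
            results &= self.index.get(term, set())
        return sorted(results)
```

Passing the fetch step in as a parameter keeps the sketch runnable without network access. A real fetch could, for instance, wrap urllib.request.urlopen and decode the response body, and a submitted page only becomes searchable after process_pending() has run.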

Meta Search Engine
Unlike individual search engines, meta-search engines do not have their own databases. They simply send queries to multiple web search engines. Many meta-search engines integrate the search results, eliminate duplicates, and rank the results by their own criteria. Meta-search engines are not designed for exhaustive searches: most use only the top 10 to 100 hits from each search engine. While this is sufficient for most searches, individual search engines must be consulted if a user needs to examine all of the hits and can’t reformulate the query to avoid a large number of hits.
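
The steps above (query several engines, keep only the top hits from each, remove duplicates, and re-rank) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: each engine is modeled as a hypothetical callable that returns a ranked list of URLs, the normalize helper is a deliberately crude duplicate test, and the reciprocal-rank scoring is just one simple merging criterion chosen for the example, not the method of any particular meta-search engine.

```python
from collections import defaultdict


def normalize(url):
    """Crude duplicate key: lowercase, drop scheme, "www." and trailing slash."""
    key = url.lower()
    for prefix in ("http://", "https://"):
        if key.startswith(prefix):
            key = key[len(prefix):]
    if key.startswith("www."):
        key = key[4:]
    return key.rstrip("/")


def meta_search(query, engines, top_k=10):
    """Merge the top_k hits of each engine into one de-duplicated ranking.

    engines: list of hypothetical callables, each mapping a query string
    to a list of URLs ordered from best to worst.
    """
    scores = defaultdict(float)   # duplicate key -> accumulated score
    first_seen = {}               # duplicate key -> first URL form encountered
    for engine in engines:
        hits = engine(query)[:top_k]          # only the top hits are used
        for rank, url in enumerate(hits, start=1):
            key = normalize(url)
            scores[key] += 1.0 / rank         # assumed criterion: reciprocal rank
            first_seen.setdefault(key, url)
    # Highest score first; ties are broken alphabetically by key.
    ranked = sorted(scores, key=lambda k: (-scores[k], k))
    return [first_seen[k] for k in ranked]
```

Because only top_k hits per engine are ever requested, results ranked lower by every engine never appear in the merged list, which is why such a tool is unsuitable for exhaustive searches.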

Intelligent Search Agent
An Intelligent Web Search Agent can be described as “an autonomous, goal-directed process that is situated in, is aware of, and reacts to its World Wide Web environment”. It uses protocols to cooperate with other agents (software or human) to accomplish its tasks. Intelligent Web Search Agents understand the information in individual documents.

Special Issue


AI Magazine, Volume 18, Issue 2

Application


(1)AltaVista:
http://www.altavista.com
(2)Google
http://www.google.com
(3)HotBot
 http://www.hotbot.com
(4)Teoma
http://www.teoma.com
(5)WiseNut
http://www.wisenut.com
(6)Yam
http://www.yam.com
(7)Yahoo
http://www.yahoo.com
(8)ResearchIndex
http://citeseer.nj.nec.com/cs

Articles


Google Hires a Grown-Up
http://www.business2.com/articles/mag/0,1640,36736,FF.html

Search Engine Sizes http://searchenginewatch.com/reports/sizes.html

Google Fires New Salvo in Search Engine Size Wars http://searchenginewatch.com/searchday/01/sd1211-google.html

Google Goes for Stop Words
http://searchenginewatch.com/searchday/01/sd1203-google.html

How Search Engines Use Link Analysis http://searchenginewatch.com/searchday/01/sd1219-links.html

Google Unveils More of the Invisible Web http://searchenginewatch.com/searchday/01/sd1031-google-files.html

Wisenut, the Google Killer? Nah... http://searchenginewatch.com/searchday/01/sd0905-wisenut.html

New at Google
http://searchenginewatch.com/searchday/01/sd0827-news.html

Google Polishes its Image http://searchenginewatch.com/searchday/01/sd0626-google-images.html

Google Restores Usenet Archive http://searchenginewatch.com/searchday/01/sd0509-googlegroups.html