Making Sense of Search Results by Automatic Web-page Classifications

Ben Choi
Computer Science
College of Engineering and Science
Louisiana Tech University, USA

Abstract: This paper reports the development of a system that automatically organizes Internet web pages into meaningful categories, so that users can find useful information in less time. Finding the information we need is the central problem in using the Internet today, and with the explosive growth of the Internet, information overload is getting worse. The proposed system classifies web pages based on three types of information: (1) organizational information among web pages (inter-web-page relationships), such as a page's URL and the links within it; (2) meta-web-page information, such as data contained in META tags and the formatting data of a web page; and (3) web-page-content information, such as keywords and phrases in the content of a web page. Our results show that combining all three types of information improves classification accuracy.
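
The three types of information named in the abstract can be illustrated with a minimal Python sketch. It is not the system described in the paper: the names PageFeatureParser and extract_features are hypothetical, the tokenization rules are assumptions, formatting data is not covered, and the classifier that would consume these features is not shown.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


class PageFeatureParser(HTMLParser):
    """Hypothetical helper: collects links, META data, and visible text."""

    def __init__(self):
        super().__init__()
        self.links = []        # href values of <a> tags
        self.meta = {}         # META name -> content
        self.text_parts = []   # visible text fragments
        self._skip_depth = 0   # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore script and style bodies; keep everything else as content.
        if self._skip_depth == 0:
            self.text_parts.append(data)


def extract_features(url, html):
    """Group features of one page by the three information types."""
    parser = PageFeatureParser()
    parser.feed(html)
    parser.close()

    # (1) Organizational information: tokens of the URL and outgoing links.
    parsed = urlparse(url)
    location = (parsed.netloc + parsed.path).lower()
    url_tokens = [t for t in re.split(r"[^a-z0-9]+", location) if t]

    # (3) Content information: lowercase words from the visible text.
    words = re.findall(r"[a-z]+", " ".join(parser.text_parts).lower())

    return {
        "organizational": {"url_tokens": url_tokens, "links": parser.links},
        "meta": parser.meta,  # (2) meta-web-page information from META tags
        "content": words,
    }
```

A classifier could then weight or combine the three feature groups; how the paper's system combines them is not specified in this abstract.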



Full Paper:

Choi, Ben (2001) “Making Sense of Search Results by Automatic Web-page Classifications,” Proc. of WebNet 2001 -- World Conference on the WWW and Internet, pp. 184-186.