A New Search Engine Integrating Hierarchical Browsing and Keyword Search
Charles Ling, Xiao Li and Da Kuang
The original Yahoo! search engine consisted of a manually organized topic hierarchy of webpages for easy browsing. Modern search engines (such as Google and Bing), on the other hand, return a flat list of webpages based on keywords. It would be ideal if hierarchical browsing and keyword search could be seamlessly combined. The main difficulty in doing so is to automatically (i.e., not manually) classify and rank a massive number of webpages into various hierarchies (such as topics, media types, and regions of the world). In this paper we report our attempt at building such an integrated search engine, called SEE (Search Engine with hiErarchy). We implement a hierarchical classification system based on Support Vector Machines and embed it in SEE. We also design a novel user interface that allows users to dynamically trade off higher accuracy against more results in any (sub)category of the hierarchy. Though our current search engine is still small (indexing about 1.24 million webpages), the results, including a small user study, show great promise for integrating such techniques into next-generation search engines.
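The accuracy-versus-results trade-off mentioned above can be illustrated with a minimal sketch. It assumes, hypothetically, that each indexed page carries a per-category classifier confidence in [0, 1] and that the interface control maps to a confidence threshold; the abstract does not specify the actual mechanism used in SEE. The names `Hit`, `filter_by_category`, the example URLs, and the scores are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    url: str
    relevance: float        # keyword-match score (hypothetical)
    category_scores: dict   # category path -> classifier confidence in [0, 1] (assumed)


def filter_by_category(hits, category, threshold):
    """Keep hits whose confidence for `category` is at least `threshold`,
    ordered by keyword relevance (highest first)."""
    kept = [h for h in hits if h.category_scores.get(category, 0.0) >= threshold]
    return sorted(kept, key=lambda h: h.relevance, reverse=True)


# Made-up pages for illustration only.
hits = [
    Hit("a.example", 0.9, {"Science/Physics": 0.95}),
    Hit("b.example", 0.8, {"Science/Physics": 0.55}),
    Hit("c.example", 0.7, {"Arts/Music": 0.90}),
]

# A high threshold favors accuracy; a low threshold favors more results.
strict = filter_by_category(hits, "Science/Physics", 0.9)
loose = filter_by_category(hits, "Science/Physics", 0.5)
```

Under these assumptions, raising the threshold returns fewer pages that are more likely to belong to the chosen category, while lowering it returns more pages at the cost of more misclassified ones.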