Enhancing PubMed Search Results with Semantic Annotation Using Augmented Browsing
Hong-Jie Dai, Po-Ting Lai, Richard Tzong-Han Tsai and Wen-Lian Hsu
In this paper, we describe how we integrated an AI system into the PubMed search website using augmented browsing technology. Our system dynamically enriches the PubMed search results displayed in a user’s browser with semantic annotations provided by several natural language processing (NLP) subsystems, including a sentence splitter, a part-of-speech tagger, a gene mention recognizer, a section categorizer and a gene normalizer (GN), which maps each gene mention to a unique gene database ID. Once our browser extension is installed, the client browser can run userscripts that modify the HTML content of the PubMed search results page on the fly to provide additional information on the genes and gene products identified by our NLP subsystems. For example, one script hyperlinks identified genes to their EntrezGene database entries so that users can view detailed information on the genes in a pop-up window. GN involves three main steps: candidate ID matching, false positive filtering and disambiguation. Because these steps are highly dependent on each other, we propose a joint model using a Markov logic network (MLN). The experimental results show that our joint model outperforms a baseline system that executes the three steps separately.
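The hyperlinking step described above can be illustrated with a minimal sketch. It is not the authors' userscript: it assumes the gene normalizer returns character offsets plus an Entrez Gene ID for each mention, and the names `GeneAnnotation`, `link_genes` and `ENTREZ_GENE_URL` are hypothetical. The sketch is written in Python for brevity, whereas an actual userscript would perform the same rewrite in the browser.

```python
from html import escape
from typing import List, NamedTuple


class GeneAnnotation(NamedTuple):
    """Hypothetical output record of a gene normalizer for one mention."""
    start: int      # character offset where the mention begins
    end: int        # character offset just past the mention
    gene_id: str    # normalized Entrez Gene ID


# Public URL pattern for Entrez Gene records at NCBI.
ENTREZ_GENE_URL = "https://www.ncbi.nlm.nih.gov/gene/{gene_id}"


def link_genes(text: str, annotations: List[GeneAnnotation]) -> str:
    """Return HTML in which each annotated mention links to its gene record."""
    parts = []
    cursor = 0
    for ann in sorted(annotations, key=lambda a: a.start):
        if ann.start < cursor:
            # Assumption: overlapping mentions are skipped, keeping the first.
            continue
        # Escape the plain text between the previous mention and this one.
        parts.append(escape(text[cursor:ann.start]))
        url = ENTREZ_GENE_URL.format(gene_id=ann.gene_id)
        mention = escape(text[ann.start:ann.end])
        parts.append(f'<a href="{url}" target="_blank">{mention}</a>')
        cursor = ann.end
    # Append whatever text follows the last mention.
    parts.append(escape(text[cursor:]))
    return "".join(parts)


if __name__ == "__main__":
    sentence = "Mutations in TP53 are common."
    # 7157 is the Entrez Gene ID of human TP53; the offsets are hand-picked.
    print(link_genes(sentence, [GeneAnnotation(13, 17, "7157")]))
```

Working from character offsets rather than re-matching gene names in the page keeps the rewrite tied to the normalizer's decisions, so an ambiguous symbol is linked only where it was actually resolved to an ID.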