Finding role errors of the NCIT gene hierarchy using the NCBI
Department of Computer Science
Master of Science
Role error discovery
Due to the knowledge discovery process of gene information, details such as organism and chromosomal location should be known. Furthermore, the extensive biomedical research in genomics led to discovery of processes and diseases in which a gene plays a role. In the Gene hierarchy of the NCI Thesaurus (NCIT) such knowledge is represented by appropriate roles. However, upon review of the Gene hierarchy of the NCIT, many role errors are found. Realizing that such details are provided by another knowledge repository of NIH, the NCBI gene database, a methodology is presented to use NCBI to discover role errors for the Gene hierarchy.
For this, a web crawler was developed to retrieve the knowledge from the NCBI, so it could be represented in a compatible way with the NCIT gene roles, to facilitate comparison. The most difficult challenge is with process roles for which one gene is playing role in several processes and thus several targets exist for one gene. A procedure is developed to explore process role errors in the Gene hierarchy of the NCIT utilizing the Biological Process hierarchy of the NCIT and NCBI gene database.
njit-etd2006-063 (122 pages ~ 4,326 KB pdf)
Please complete this Feedback Form to inform us about your experience using this website. It will assist us in better serving your information needs in the future. Thank You!
Created September 8, 2008