BioNav: Effective Navigation on Query Results of Biomedical Databases

Abhijith Kashyap (1), Vagelis Hristidis (2), Michalis Petropoulos (1), Sotiria Tavoulari (3)
(1) Dept. of Computer Science and Engineering, University at Buffalo, SUNY
(2) School of Computing and Information Sciences, Florida International University
(3) Department of Pharmacology, Yale University
September 8, 2008

Outline: Motivation, BioNav Framework, Navigation Cost Models, Algorithms, Experiments, Future Work

MOTIVATION
- Exploratory queries are increasingly common in the life sciences, e.g., searching PubMed for citations on a given keyword.
- These queries return too many results, but only a small fraction is relevant; the user ends up examining all or most of the result tuples to find the interesting ones.
- This can happen when the user is unsure about what is relevant, e.g., the user is looking for articles on a broad topic: the query "cancer" returns over 2 million citations on PubMed.
- This phenomenon is commonly referred to as information overload.

COMMON APPROACHES TO AVOID INFORMATION OVERLOAD
- Ranking
- Categorization

CATEGORIZATION IN INFORMATION SYSTEMS
Assumptions:
- Tuples in the database are annotated with one or more categories or concepts.
- The set of concepts is arranged in a concept hierarchy.
  Example: each citation in PubMed is associated with several concepts from the MeSH (Medical Subject Headings) hierarchy, typically 12 to 20.
- Users querying the database are familiar with the controlled vocabulary of the concept hierarchy.

QUERY RESULT NAVIGATION: NAIVE APPROACH (GoPubMed)
Create the navigation tree as follows:
- Extract the set S of concepts annotating tuples in the query result set Q.
- Construct the minimal sub-concept-hierarchy tree T that covers all concepts in S.
[Figure: section of the navigation tree for the query "Prothymosin" (313 results)]
Problems:
- Massive size of the navigation tree: MeSH has over 48,000 concept nodes, and the 313 results span over 3,000 of these concepts.
- Large number of duplicate tuples: each tuple is annotated with 12 to 20 MeSH concepts, so the total tuple count is over 5,000.
- The effort required to navigate the query results increases.

QUERY RESULT NAVIGATION: DYNAMIC APPROACH (BioNav)
[Figure: navigation steps for the query "Prothymosin"]
- Only a selective set of descendants is shown.
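
The naive construction of the navigation tree (extract S, then build the tree T that covers S) can be illustrated with a minimal Python sketch. It is not taken from the slides or from GoPubMed. It assumes the hierarchy is a strict tree given as a child-to-parent map, and it reads "minimal tree covering S" as every concept in S plus all of its ancestors up to the root. The function name `build_navigation_tree`, the toy hierarchy, and the citation ids are all hypothetical.

```python
def build_navigation_tree(parent, annotations):
    """Naive navigation tree for a query result (illustrative sketch).

    parent:      concept -> parent concept (the root maps to None); assumed a strict tree
    annotations: result tuple id -> set of concepts annotating that tuple
    Returns (children, attached):
      children: kept concept -> sorted list of kept child concepts (the tree T)
      attached: concept in S -> set of tuple ids listed under it
    """
    # Step 1: extract S by attaching each tuple to every concept that annotates it.
    attached = {}
    for tuple_id, concepts in annotations.items():
        for concept in concepts:
            attached.setdefault(concept, set()).add(tuple_id)

    # Step 2: keep each concept in S together with its ancestors up to the root.
    kept = set()
    for concept in attached:
        node = concept
        while node is not None and node not in kept:
            kept.add(node)
            node = parent[node]

    # Step 3: link the kept concepts into the tree T.
    children = {node: [] for node in kept}
    for node in kept:
        if parent[node] is not None:
            children[parent[node]].append(node)
    for kids in children.values():
        kids.sort()
    return children, attached


# Toy hierarchy and annotations (hypothetical, far smaller than MeSH).
parent = {
    "Root": None,
    "Proteins": "Root",
    "Nuclear Proteins": "Proteins",
    "Histones": "Nuclear Proteins",
    "Diseases": "Root",
    "Neoplasms": "Diseases",
    "Anatomy": "Root",  # annotates no result, so it is pruned from T
}
annotations = {
    1: {"Histones", "Neoplasms"},
    2: {"Nuclear Proteins"},
    3: {"Histones", "Neoplasms"},
}

children, attached = build_navigation_tree(parent, annotations)
total_listed = sum(len(ids) for ids in attached.values())
print(children["Root"])                 # top-level concepts kept in T
print(total_listed, len(annotations))   # listings in the tree vs. distinct citations
```

Even in this toy case the three citations are listed five times across the tree, because each citation appears under every concept that annotates it. The same effect, at 12 to 20 concepts per citation, is what pushes 313 results to over 5,000 listed tuples.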