LOGICAL FORM IDENTIFICATION FOR MEDICAL CLINICAL TRIALS


by
Clint A. Tustison

A thesis submitted to the faculty of
Brigham Young University
in partial fulfillment of the requirements for the degree of
Master of Arts

Department of Linguistics and English Language
Brigham Young University
December 2004

Copyright © 2004 Clint A. Tustison
All Rights Reserved

BRIGHAM YOUNG UNIVERSITY
GRADUATE COMMITTEE APPROVAL

of a thesis submitted by
Clint A. Tustison

This thesis has been read by each member of the following graduate committee and by majority vote has been found to be satisfactory.

Date    Deryle W. Lonsdale, Chair
Date    David W. Embley
Date    Alan K. Melby

BRIGHAM YOUNG UNIVERSITY

As chair of the candidate's graduate committee, I have read the thesis of Clint A. Tustison in its final form and have found that (1) its format, citations, and bibliographical style are consistent and acceptable and fulfill university and department style requirements; (2) its illustrative materials including figures, tables, and charts are in place; and (3) the final manuscript is satisfactory to the graduate committee and is ready for submission to the university library.

Date    Deryle W. Lonsdale
        Chair, Graduate Committee

Accepted for the Department
        Lynn Henrichsen
        Department Chair

Accepted for the College
        Van C. Gessel
        Dean, College of Humanities

ABSTRACT

Programming a computer to understand natural language has become increasingly important as the amount of natural language in electronic format has increased. One area where text understanding is valuable is medical literature. Most of the research on information extraction and text understanding in medical literature has focused on medical abstracts, and researchers have used various tools to get interesting results. While medical abstracts have proved very fruitful for this type of research, very limited research has been done on extracting information and understanding text from medical clinical trials. Clinical trials are very important to doctors and medical organizations, and programming a computer to automatically understand the data would be valuable. This thesis presents LG-Soar, a system capable of parsing eligibility criteria in clinical trials and outputting a semantic representation using predicate logic. This approach differs from other information extraction approaches to natural language in that it uses a cognitive modeling engine to convert the parsed sentences into corresponding predicate logic forms. Initial results reveal that LG-Soar is a viable system for doing natural language text extraction and understanding.

ACKNOWLEDGMENTS

I would like to thank the National Science Foundation for supporting this research.

Contents

Acknowledgments
List of Tables
List of Figures
1 Introduction
2 Literature Review
  2.1 Information Extraction Domains
  2.2 Information Extraction Applications
  2.3 Information Extraction Methods
  2.4 Problems in Information Extraction
3 Method
  3.1 Clinical Trials
  3.2 Syntactic Parser
  3.3 Syntax-to-Semantics Conversion
  3.4 Output Formats
4 System Description
  4.1 Clinical Trials Corpus
  4.2 Pre-Processing
  4.3 Link Grammar Parser
  4.4 Syntax-to-Semantics Engine
  4.5 Output Formats
  4.6 Summary
5 Results
  5.1 Evaluation Metrics
  5.2 Qualitative Results
  5.3 Quantitative Results
6 Discussion
  6.1 Benefits of LG-Soar
  6.2 Future Work
7 Conclusions
References
A Examples Showing LG-Soar Process
B Examples of Incorrect Output

List of Tables

2.1 Differences between dependency and link grammars
2.2 Ambiguities in English
2.3 Prepositional phrase ambiguity
4.1 Major link types and sublinkages
4.2 Initial trace of syntax-to-semantics conversion by LG-Soar
4.3 Operators for A criterion equals serious heart problems
4.4 Productions used for A criterion equals serious heart problems
5.1 Confusion Matrix
5.2 Sample LG-Soar output
5.3 Initial LG-Soar quantitative results
6.1 Comparison of relation capture for biology
A.1 LG-Soar processing examples
B.1 Examples of incorrect output

List of Figures

2.1 Example syntactic constituency parse
3.1 Logical form identification process
3.2 Information included in clinical trials
3.3 a of clinical trial NCT00042666
4.1 Tagged XML file of clinical trial NCT00042666
4.2 Link grammar output for A criterion equals serious heart problems
4.3 LG-Soar post-processing
4.4 DRS for A criterion equals serious heart problems
4.5 Final XML output

Chapter 1
Introduction

Google currently retrieves 4,285,199,774 web pages millions of times a day.¹ More and more people are accessing the huge amounts of data available electronically and are becoming increasingly dependent on understanding this information. With an increase in computing power, textual analysis and understanding have become an important area of research in the fields of information extraction and natural language processing (NLP).

As electronic texts become more available to researchers (and humans in general), an interesting dichotomy has emerged. On the one hand, electronic text posted on the Internet caters to users' ability to read and analyze that information. Those interested in putting information on the Internet design the data's structure to be easy for humans to

