LBSC 796/INFM 718R: Week 1
Introduction to Information Retrieval

Jimmy Lin
College of Information Studies
University of Maryland
Monday, January 30, 2006

Information Retrieval Systems
- Information
  - What is “information”?
- Retrieval
  - What do we mean by “retrieval”?
  - What are different types of information needs?
- Systems
  - How do computer systems fit into the human information seeking process?

What is Information?
- What do you think?
- There is no “correct” definition
- Cookie Monster’s definition:
  - “news or facts about something”
- Different approaches:
  - Philosophy
  - Psychology
  - Linguistics
  - Electrical engineering
  - Physics
  - Computer science
  - Information science

Dictionary says…
- Oxford English Dictionary
  - information: informing, telling; thing told, knowledge, items of knowledge, news
  - knowledge: knowing, familiarity gained by experience; person’s range of information; a theoretical or practical understanding of; the sum of what is known
- Random House Dictionary
  - information: knowledge communicated or received concerning a particular fact or circumstance; news

Intuitive Notions
- Information must
  - Be something, although the exact nature (substance, energy, or abstract concept) is not clear
  - Be “new”: repetition of previously received messages is not informative
  - Be “true”: false or counterfactual information is “mis-information”
  - Be “about” something

Robert M. Losee. (1997) A Discipline Independent Definition of Information.
Journal of the American Society for Information Science, 48(3), 254-269.

Three Views of Information
- Information as process
- Information as communication
- Information as message transmission and reception

One View
- Information = characteristics of the output of a process
  - Tells us something about the process and the input
- Information-generating processes do not occur in isolation
[Figure: chained processes, each with inputs and outputs (Process 1, Process 2, …)]
Ibid.

Where’s the human?
- If a tree falls in the forest, and no one is around to hear it, is information transmitted?
- In the “information as process” view: yes, but that’s not very interesting to us
- We’re concerned about information for human consumption
  - Transmission of information from one person to another
  - Recording of information
  - Reconstruction of stored information

Another View
- Information science is characterized by “the deliberate (purposeful) structure of the message by the sender in order to affect the image structure of the recipient”
  - This implies that the sender has knowledge of the recipient's structure
- Text = “a collection of signs purposefully structured by a sender with the intention of changing image-structure of a recipient”
- Information = “the structure of any text which is capable of changing the image-structure of a recipient”

Nicholas J. Belkin and Stephen E. Robertson. (1976) Information Science and the Phenomenon of Information.
Journal of the American Society for Information Science, 27(4), 197-204.

Transfer of Information
- Communication = transmission of information
[Figure: encoding and decoding between thoughts, words, and sounds; labels: Speech, Writing, Telepathy?]

Information Theory
- Better called “communication theory”
- Developed by Claude Shannon in the 1940s
  - Concerned with the transmission of electrical signals over wires
  - How do we send information quickly and reliably?
- Underlies modern electronic communication:
  - Voice and data traffic…
  - Over copper, fiber optic, wireless, etc.
- Famous result: Channel Capacity Theorem
- Formal measure of information in terms of entropy
  - Information = “reduction in surprise”

The Noisy Channel Model
- Communication = producing the same message at the destination that was sent at the source
  - The message must be encoded for transmission across a medium (called the channel)
  - But the channel is noisy and can distort the message
- Semantics (meaning) is irrelevant
[Figure: Source → message → Transmitter → channel (with noise) → Receiver → message → Destination]

A Synthesis
- Information retrieval as communication over time and space, across a noisy channel
[Figure: the noisy channel diagram (Source, Transmitter, channel, noise, Receiver, Destination) alongside its retrieval counterpart (Sender, Encoding, indexing/writing, storage, noise, retrieval/reading, Decoding, Recipient)]

Information Hierarchy
[Figure: Data → Information → Knowledge → Wisdom, each level more refined and abstract]

Information Hierarchy
- Data
  - The raw material of information
- Information
  - Data organized and presented in a particular manner
- Knowledge
  - “Justified true belief”
  - Information that can be acted upon
- Wisdom
  - Distilled and integrated knowledge
  - Demonstrative of high-level “understanding”

A (Facetious) Example
- Data
  - 98.6º F, 99.5º F, 100.3º F, 101º F, …
- Information
  - Hourly body temperature: 98.6º F, 99.5º F, 100.3º F, 101º F, …
- Knowledge
  - If you have a temperature above 100º F, you most likely have a fever
- Wisdom
  - If you don’t feel well, go see a doctor

“Retrieval?”
- “Fetch something” that’s been stored
- Recover a stored state of knowledge
- Search through stored messages to find some messages relevant to the task at hand
[Figure: Sender → Encoding → indexing/writing → storage (with noise) → retrieval/reading → Decoding → Recipient]

What is IR?
- Information retrieval is a problem-oriented discipline, concerned with the problem of the effective and efficient transfer of desired information between human generator and human user

Nicholas J. Belkin. (1980) Anomalous States of Knowledge as a Basis for Information Retrieval. Canadian Journal of Information Science, 5, 133-143.

Modern History
- The “information overload” problem is much older than you may think
- Origins in period immediately after World War II
  - Tremendous scientific progress during the war
  - Rapid growth in amount of scientific publications available
- The “Memex Machine”
  - Conceived by Vannevar Bush, President Roosevelt's science advisor
  - Outlined in 1945 Atlantic Monthly article titled “As We May Think”
  - Foreshadows the development of hypertext (the Web) and information retrieval systems

The Memex Machine
[Figure]

Types of Information Needs
- Retrospective
  - “Searching the past”
  - Different queries posed against a static collection
  - Time invariant
- Prospective
  - “Searching the future”
  - Static query posed against a dynamic collection
  - Time dependent

Retrospective Searches (I)
- Ad hoc retrieval: find documents “about this”
- Known item search
- Directed exploration

Identify positive accomplishments of the Hubble telescope since it was launched in 1991.
Compile a list of mammals that are considered to be endangered, identify
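
The retrospective/prospective distinction under Types of Information Needs can be made concrete with a toy sketch. Everything here is an assumption for illustration, not part of the lecture: the function names, the sample documents, and the matching rule (a document matches when it contains every query term) are all hypothetical stand-ins for a real retrieval model.

```python
def matches(query, document):
    """Toy matching rule (an assumption): every query term must appear in the document."""
    doc_terms = set(document.lower().split())
    return all(term in doc_terms for term in query.lower().split())


def retrospective_search(collection, query):
    """Retrospective: a new query is posed against a fixed, already-stored collection."""
    return [doc for doc in collection if matches(query, doc)]


def prospective_filter(standing_query, incoming_documents):
    """Prospective: one fixed query is checked against each document as it arrives."""
    for doc in incoming_documents:
        if matches(standing_query, doc):
            yield doc


# Hypothetical static collection.
collection = [
    "hubble telescope images distant galaxies",
    "endangered mammals include the giant panda",
    "hubble telescope repaired by shuttle crew",
]

# Retrospective: different queries, same collection.
hubble_hits = retrospective_search(collection, "hubble telescope")
mammal_hits = retrospective_search(collection, "endangered mammals")

# Prospective: same query, documents arriving over time (simulated with an iterator).
stream = iter([
    "new endangered mammals list published",
    "stock markets close higher",
])
alerts = list(prospective_filter("endangered mammals", stream))
```

In the retrospective case the collection is held fixed and the query varies; in the prospective case the query is held fixed and the collection changes, which is why the slides call the first time invariant and the second time dependent.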

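
The Information Theory slide mentions a formal measure of information in terms of entropy, with information as “reduction in surprise”. The slides do not give a formula, so the sketch below assumes the standard Shannon definitions: the surprisal of an outcome with probability p is -log2(p) bits, and entropy is the expected surprisal over a distribution. The function names are illustrative only.

```python
import math


def surprisal_bits(p):
    """Surprise of observing an outcome with probability p, in bits (assumes 0 < p <= 1)."""
    return -math.log2(p)


def entropy_bits(probabilities):
    """Shannon entropy in bits: the average surprisal over all outcomes.

    Outcomes with zero probability are skipped, since they contribute nothing.
    """
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


fair_coin = entropy_bits([0.5, 0.5])    # maximally uncertain two-way choice
biased_coin = entropy_bits([0.9, 0.1])  # outcome is mostly predictable
certain = entropy_bits([1.0])           # nothing left to learn
```

A fair coin carries more entropy than a biased one, and a certain outcome carries none: the more predictable the message, the less surprise its arrival removes.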
