CSCI 5417-IR Fall 2009
Quiz 1

Name: ________________________

On my honor, as a University of Colorado at Boulder student, I have neither given nor received unauthorized assistance on this work.

1. (5 Points) What is your favorite movie?

2. (5 Points) True/False: Dividing inverted indexes into separate dictionary and postings files is done in order to speed up the process of creating an index.

3. (5 Points) In what way should postings be ordered to facilitate the processing of Boolean queries? Why?

4. (10 Points) The formula on the attached page describes Lucene's default document scoring mechanism. Think about this formula in the context of the basic vector space cosine measure and answer the following questions.

   a) Give a motivation for why the idf term is squared.

   b) What is the likely role of the lengthNorm factor in this equation?

5. (10 Points) Describe two ways to speed up the cosine algorithm on the attached page. For each of your suggested speedups, characterize the impact that your method has on the output of the algorithm.

6. (5 Points) What aspect of ad hoc retrieval system performance is relevance feedback primarily intended to address?

7. Lucene scoring equation; Cosine