Week 7
Lists

Special thanks to Roy McElmurry, John Kurkowski, Scott Shawcroft, Ryan Tucker, and Paul Beck for their work.
Except where otherwise noted, this work is licensed under:
http://creativecommons.org/licenses/by-nc-sa/3.0

Lists
• list: Python's equivalent to Java's array (but cooler)
  – Declaring:
        name = [value, value, ..., value]
    or,
        name = [value] * length
  – Accessing/modifying elements: (same as Java)
        name[index] = value

    >>> scores = [9, 14, 18, 19, 16]
    >>> scores
    [9, 14, 18, 19, 16]
    >>> counts = [0] * 4
    >>> counts
    [0, 0, 0, 0]
    >>> scores[0] + scores[4]
    25

Indexing
• Lists can be indexed using positive or negative numbers:

    index    0   1   2   3   4   5   6   7
    value    9  14  12  19  16   7  24  15
    index   -8  -7  -6  -5  -4  -3  -2  -1

    >>> scores = [9, 14, 12, 19, 16, 7, 24, 15]
    >>> scores
    [9, 14, 12, 19, 16, 7, 24, 15]
    >>> scores[3]
    19
    >>> scores[-3]
    7

Recall: Strings
• Accessing character(s):
        variable[index]
        variable[index1:index2]
  – index2 is exclusive
  – index1 or index2 can be omitted (the slice then starts at the beginning or runs to the end of the string)

    index    0    1    2    3    4    5    6    7
    value   'P'  '.'  ' '  'D'  'i'  'd'  'd'  'y'
    index   -8   -7   -6   -5   -4   -3   -2   -1

    >>> name = "P. Diddy"
    >>> name[0]
    'P'
    >>> name[7]
    'y'
    >>> name[-1]
    'y'
    >>> name[3:6]
    'Did'
    >>> name[3:]
    'Diddy'
    >>> name[:-2]
    'P. Did'

Slicing
• slice: A sub-list created by specifying start/end indexes
        name[start:end]        # end is exclusive
        name[start:]           # to end of list
        name[:end]             # from start of list
        name[start:end:step]   # every step'th value

    index    0   1   2   3   4   5   6   7
    value    9  14  12  19  16  18  24  15
    index   -8  -7  -6  -5  -4  -3  -2  -1

    >>> scores = [9, 14, 12, 19, 16, 18, 24, 15]
    >>> scores[2:5]
    [12, 19, 16]
    >>> scores[3:]
    [19, 16, 18, 24, 15]
    >>> scores[:3]
    [9, 14, 12]
    >>> scores[-3:]
    [18, 24, 15]

Other List Abilities
– Lists can be printed (or converted to string with str()).
– Find out a list's length by passing it to the len function.
– Loop over the elements of a list using a for ... in loop.

    >>> scores = [9, 14, 18, 19]
    >>> print("My scores are", scores)
    My scores are [9, 14, 18, 19]
    >>> len(scores)
    4
    >>> total = 0
    >>> for score in scores:
    ...     print("next score:", score)
    ...     total += score
    ...
    next score: 9
    next score: 14
    next score: 18
    next score: 19
    >>> total
    60

Ranges, Strings, and Lists
• The range function returns a sequence of integers that behaves much like a list; in Python 3, pass it to list() to get an actual list.
• Strings behave like lists of characters:
  – len
  – indexing and slicing
  – for ... in loops

    >>> nums = list(range(5))
    >>> nums
    [0, 1, 2, 3, 4]
    >>> nums[-2:]
    [3, 4]
    >>> len(nums)
    5

String Splitting
• split breaks a string into a list of tokens.
        name.split()            # break by whitespace
        name.split(delimiter)   # break by delimiter
• join performs the opposite of a split.
        delimiter.join(list)

    >>> name = "Brave Sir Robin"
    >>> name[-5:]
    'Robin'
    >>> tokens = name.split()
    >>> tokens
    ['Brave', 'Sir', 'Robin']
    >>> name.split("r")
    ['B', 'ave Si', ' Robin']
    >>> "||".join(tokens)
    'Brave||Sir||Robin'

Tokenizing File Input
• Use split to tokenize line contents when reading files.
  – You may want to type-cast tokens: type(value)

    >>> f = open("example.txt")
    >>> line = f.readline()
    >>> line
    'hello world 42 3.14\n'
    >>> tokens = line.split()
    >>> tokens
    ['hello', 'world', '42', '3.14']
    >>> word = tokens[0]
    >>> word
    'hello'
    >>> answer = int(tokens[2])
    >>> answer
    42
    >>> pi = float(tokens[3])
    >>> pi
    3.14

Exercise
• Recall hours.txt.
  Suppose the number of days can vary:

    123 Susan 12.5 8.1 7.6 3.2
    456 Brad 4.0 11.6 6.5 2.7 12
    789 Jenn 8.0 8.0 8.0 8.0 7.5

• Compute each worker's total hours and hours/day.
  – Should work no matter how many days a person works.

    Susan ID 123 worked 31.4 hours: 7.85 / day
    Brad ID 456 worked 36.8 hours: 7.36 / day
    Jenn ID 789 worked 39.5 hours: 7.9 / day

Exercise Answer

hours.py

    file = open("hours.txt")
    for line in file:
        tokens = line.split()
        id = tokens[0]
        name = tokens[1]

        # cumulative sum of this employee's hours
        hours = 0.0
        days = 0
        for token in tokens[2:]:
            hours += float(token)
            days += 1

        print(name, "ID", id, "worked",
              hours, "hours:", hours / days, "/ day")

Exercise
• Suppose we have a file of midterm scores, scores.txt:

    76
    89
    76
    72
    68

• Create a histogram of the scores as follows:

    75: *
    76: *****
    79: **
    81: ********
    82: ******
    84: ***********

Exercise
• Suppose we have Internet Movie Database (IMDb) data:

    1 9.1 196376 The Shawshank Redemption (1994)
    2 9.0 139085 The Godfather: Part II (1974)
    3 8.8 81507 Casablanca (1942)

• Write a program to search for all films with a given phrase:

    Search word? part
    Rank    Votes   Rating  Title
    2       139085  9.0     The Godfather: Part II (1974)
    40      129172  8.5     The Departed (2006)
    95      20401   8.2     The Apartment (1960)
    192     30587   8.0     Spartacus (1960)
    4 matches.

Exercise Answer

    search_word = input("Search word? ")
    matches = 0
    file = open("imdb.txt")
    for line in file:
        tokens = line.split()
        rank = int(tokens[0])
        rating = float(tokens[1])
        votes = int(tokens[2])
        title = " ".join(tokens[3:])

        # does title contain search_word?
        if search_word.lower() in title.lower():
            matches += 1
            print(rank, "\t", votes, "\t", rating, "\t", title)

    print(matches,
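
Returning to the midterm-score histogram exercise (scores.txt), which the slides leave without an answer: one possible approach is to keep a list of counters built with [value] * length, add one to the counter at each score's index, and then print a row of stars for every score that occurred. This is only a sketch, not the course's official solution. It assumes every score is a whole number from 0 to 100, and the names MAX_SCORE, count_scores, and histogram_rows are hypothetical helpers introduced here for illustration.

```python
MAX_SCORE = 100  # assumption: scores are whole numbers from 0 to 100


def count_scores(lines):
    """Return a list where counts[s] is how many times score s appears."""
    counts = [0] * (MAX_SCORE + 1)
    for line in lines:
        for token in line.split():  # a blank line yields no tokens
            counts[int(token)] += 1
    return counts


def histogram_rows(counts):
    """Build one 'score: ***' row for each score that occurs at least once."""
    rows = []
    for score in range(len(counts)):
        if counts[score] > 0:
            rows.append(str(score) + ": " + "*" * counts[score])
    return rows


# Sample lines copied from the scores.txt excerpt on the slide; to read the
# real file, pass open("scores.txt") instead of this list.
sample = ["76\n", "89\n", "76\n", "72\n", "68\n"]
rows = histogram_rows(count_scores(sample))
for row in rows:
    print(row)
```

Because the counters are indexed by score, walking the list from index 0 upward prints the rows in increasing score order without any sorting step.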