Penn CIS 399 - Python Programming Handout


CSE 399-004, Spring 2006
Python Programming
Handout 4
www.seas.upenn.edu/~cse39905

Course Plan
• Part I: Fundamentals
    • Syntax
    • Data Structures
    • Basic Functional Programming
• Part II
    • Regular Expressions
    • Object-Oriented Programming
    • Lazy Functional Programming
• Part III: Special Topics
    • AI Search
    • Comparisons w/ Ruby/OCaml/Scheme
    • Unicode and Multilinguality

Today
• Command-line Arguments
• Basic Networking
• String Formatting

Command-Line Arguments

Simple Arguments
• sys.argv

    C:    int main (int argc, char **argv)
    Java: public static void main (String argv[])

test.py:

    import sys
    for arg in sys.argv:
        print arg

    $ python test.py
    test.py
    $ python test.py abc def
    test.py
    abc
    def
    $ python test.py --help
    test.py
    --help
    $ python test.py -m kant.xml
    test.py
    -m
    kant.xml

argv[0] is always the program itself (like C, but unlike Java)

Optional Arguments
• getopt

    turnin -c cse39905 -p hw1 a.py b.py

    >>> import getopt
    >>> arglist = '-a -b -cfoo -d bar a1 a2'.split()
    >>> arglist
    ['-a', '-b', '-cfoo', '-d', 'bar', 'a1', 'a2']
    >>> opts, args = getopt.getopt(arglist, 'abc:d:')
    >>> opts
    [('-a', ''), ('-b', ''), ('-c', 'foo'), ('-d', 'bar')]
    >>> args
    ['a1', 'a2']
    >>> getopt.getopt("a b -c d".split(), "abc:d")
    ([], ['a', 'b', '-c', 'd'])
    >>> getopt.getopt("-a b -c".split(), "a:c:")
    getopt.GetoptError: option -c requires argument

• opts: the options
• args: the additional arguments
• options always precede the additional arguments

Long Option Names

    >>> s = '--condition=foo --testing --output-file \
    ... abc.def -x a1 a2'
    >>> args = s.split()
    >>> args
    ['--condition=foo', '--testing', '--output-file', 'abc.def', '-x', 'a1', 'a2']
    >>> optlist, args = getopt.getopt(args, 'x',
    ...     ['condition=', 'output-file=', 'testing'])
    >>> optlist
    [('--condition', 'foo'), ('--testing', ''), ('--output-file', 'abc.def'), ('-x', '')]
    >>> args
    ['a1', 'a2']

Typical Example

    import getopt, sys

    def main():
        try:
            opts, args = getopt.getopt(sys.argv[1:], "ho:v",
                                       ["help", "output="])
        except getopt.GetoptError:
            # print help information and exit:
            usage()
            sys.exit(2)
        output = None
        verbose = False
        for o, a in opts:
            if o == "-v":
                verbose = True
            if o in ("-h", "--help"):
                usage()
                sys.exit()
            if o in ("-o", "--output"):
                output = a
        # ...

    if __name__ == "__main__":
        main()

Basic Networking

Fetching a Web Page

    >>> import urllib2
    >>> url = 'http://tycho.usno.navy.mil/cgi-bin/timer.pl'
    >>> for line in urllib2.urlopen(url):
    ...     if 'EDT' in line:
    ...         print line
    ...
    <BR>Apr. 02, 10:27:28 AM EDT

The page as rendered:

    US Naval Observatory Master Clock Time
    Apr. 03, 14:27:28 UTC
    Apr. 03, 10:27:28 AM EDT
    Apr. 03, 09:27:28 AM CDT
    Apr. 03, 08:27:28 AM MDT
    Apr. 03, 07:27:28 AM PDT
    Apr. 03, 06:27:28 AM AKDT
    Apr. 03, 04:27:28 AM HAST
    Time Service Department, US Naval Observatory

The page source:

    <html><body><TITLE>What time is it?</TITLE><H2> US Naval Observatory Master Clock
    <BR>Apr. 03, 14:27:28 UTC
    <BR>Apr. 03, 10:27:28 AM EDT
    <BR>Apr. 03, 09:27:28 AM CDT
    <BR>Apr. 03, 08:27:28 AM MDT
    <BR>Apr. 03, 07:27:28 AM PDT
    <BR>Apr. 03, 06:27:28 AM AKDT
    <BR>Apr. 03, 04:27:28 AM HAST
    </H3></B><P><A HREF="http://tycho.usnoObservatory</A></body></html>

Sending Emails

    >>> import smtplib
    >>> server = smtplib.SMTP('smtp.seas.upenn.edu')
    >>> server.sendmail('[email protected]','[email protected]', "hi")
    {}
    >>> server.quit()

Six Degrees of Separation
• HW 3 Problem 1, involving
    • parsing HTML
    • using regular expressions
    • depth-first search
• Input: command-line arguments:
    • [-d max] [-h] [--help] URL1 URL2
    • default max is 6
• Output:
    • shortest path within max links, or
    • “unreachable within max links”

[Figure: example chain of links, with the labels Penn, Directories, Personal Pages, SEAS Personal Pages, my homepage (www.cis.upenn.edu/~lhuang3), James W. (www.seas.upenn.edu/~jswalker)]

Regular Expressions

Part of this is based on the “Regular Expression Howto”:
http://www.amk.ca/python/howto/regex/

String Pattern Matching
• Unix command ls *.txt or ls hw?.p*
• Python regular expressions are different: (), *, +, ?
• *: repeating 0 or more times
    • ab*d matches ad, abd, abbd, ...
    • a(bcd)*d matches ad, abcdd, abcdbcdd, ...
• +: repeating 1 or more times
    • ab+d matches abd, abbd, ...
    • a(bcd)+d matches abcdd, abcdbcdd, ...
• ?: 0 or 1 times

Character Class
• | means “or”: (aa|bb) matches aa or bb
• [abc] matches a, b, or c
    • or simply [a-c]
    • equivalent to (a|b|c)
• [abc]+ matches a, b, c, aa, ab, ac, ba, bb, bc, ca, cb, cc, ...
• [^5] matches any char except 5
• [^0-9] matches any char except a digit
• a[bcd]* matches many more strings than a(bcd)*

Matching is Greedy
• by “matching” we mean matching the beginning portion of a string
    • a(bc)+ matches the prefix abcbc of abcbcd
• greedy search with backtracking
    • a(bcd)*b matches all of abcdb, the prefix abcdb of abcdbcd, and the prefix ab of abcd
    • try matching the pattern a[bcd]*b against the string abcbd

Escape Characters
• characters with special meanings
    • . ^ $ * + ? { } [ ] \ | ( )
• \(ab\) matches (ab)
• . matches any single character (except a newline, by default)
• .* matches any string
• \\ matches \
• ^ matches the beginning of a line or string
    • not to be confused with the ^ inside char-classes [^...]
• $ matches the end of a line or string
    • a[bcd]*b$ does not match the string abcbd

Special Char Classes
• [\s,.] matches any whitespace character, “,”, or “.”
• \b means word boundary (zero-length): \b\w+\b matches a single word (actually \b\w+ is enough)

Performing Matches

    >>> import re
    >>> print re.match('[a-z]+', "")
    None
    >>> p = re.compile('[a-z]+')
    >>> p
    <_sre.SRE_Pattern object at 80c3c28>
    >>> p.match("")
    >>> print p.match("")
    None
    >>> m = p.match('tempo')
    >>> print m
    <_sre.SRE_Match object at 80c4f68>

• the compiled version is faster for repeated use
• match() returns None on failure, or a match object on success

Match vs. Search
• match() determines whether the pattern matches at the beginning of a string
• search() scans through the string to see whether any substring matches

    >>> print p.match('::: message')
    None
    >>> m = p.search('::: message')
    >>> print m
    <re.MatchObject instance at 80c9650>
    >>> m.group()
    'message'
    >>> m.span()
    (4, 11)

findall() vs. finditer()
• findall() returns a list of all substrings that match
• finditer() returns an iterator of match objects

    >>> p = re.compile(r'\d+')
    >>> s = '12 drummers, 11 pipers, 10 lords'
    >>> p.findall(s)
    ['12', '11', '10']
    >>> iterator = p.finditer(s)
    >>> iterator
    <callable-iterator object at 0x401833ac>
    >>> for match in iterator:
    ...     print match.span()
    ...
    (0, 2)
    (13, 15)
    (24, 26)

Groups

    >>> import re
    >>> p = re.compile(r'(\w+)\s+(\d+)')
    >>> s = " I teach cse 399 and cis 500. "
    >>> p.findall(s)
    [('cse', '399'), ('cis', '500')]
    >>> for m in p.finditer(s):
    ...     print m.group(), m.groups()
    ...
    cse
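
The getopt slides and the input format given for Six Degrees of Separation ([-d max] [-h] [--help] URL1 URL2, default max 6) can be combined into a small sketch. This is only an illustration of getopt, not the course's reference solution: the helper name parse_args, the constant DEFAULT_MAX, the return shape, and the example URLs are all assumptions, and the code is written for Python 3 (the slides use Python 2).

```python
import getopt

DEFAULT_MAX = 6  # default for max, as stated on the slide


def parse_args(argv):
    """Parse [-d max] [-h] [--help] URL1 URL2 from a list of arguments.

    Hypothetical helper: returns (show_help, max_links, urls).
    """
    # "d:" means -d takes a value; "h" and "help" are plain flags.
    opts, args = getopt.getopt(argv, "d:h", ["help"])
    max_links = DEFAULT_MAX
    show_help = False
    for opt, value in opts:
        if opt == "-d":
            max_links = int(value)
        elif opt in ("-h", "--help"):
            show_help = True
    # Unless help was requested, exactly two URLs must remain.
    if not show_help and len(args) != 2:
        raise getopt.GetoptError("expected exactly two URLs")
    return show_help, max_links, args


if __name__ == "__main__":
    # Placeholder URLs, used only to exercise the parser.
    print(parse_args(["-d", "3", "http://a.example/", "http://b.example/"]))
    print(parse_args(["--help"]))
```

In a real script the argument list would be sys.argv[1:], as in the Typical Example slide, and a GetoptError would normally lead to a usage message and a non-zero exit.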

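The claims on the "Matching is Greedy" and "Escape Characters" slides can be checked directly with re.match, which only matches at the beginning of a string. The sketch below uses the patterns and strings from the slides; the helper name matched_prefix is an assumption, and the code is written for Python 3.

```python
import re

# (pattern, string) pairs taken from the slides
cases = [
    (r"a(bc)+", "abcbcd"),
    (r"a(bcd)*b", "abcdb"),
    (r"a(bcd)*b", "abcdbcd"),
    (r"a(bcd)*b", "abcd"),
    (r"a[bcd]*b", "abcbd"),
    (r"a[bcd]*b$", "abcbd"),
]


def matched_prefix(pattern, text):
    """Return the prefix of text matched by pattern, or None (hypothetical helper)."""
    m = re.match(pattern, text)
    return m.group() if m else None


for pattern, text in cases:
    print(pattern, text, matched_prefix(pattern, text))
```

For a[bcd]*b against abcbd, the character class first consumes bcbd greedily, then backtracks until a b can follow, so the match is abcb. With the trailing $ no match is possible, because the string ends in d.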

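As a side-by-side recap of match(), search(), findall(), and finditer() with groups, here is a minimal sketch using the pattern from the Groups slide. The sample string is a made-up assumption rather than the one in the handout, and the code is written for Python 3.

```python
import re

pattern = re.compile(r"(\w+)\s+(\d+)")  # word, whitespace, number
text = "rooms: lab 101, hall 220"       # hypothetical sample string

at_start = pattern.match(text)   # only tries to match at position 0
first = pattern.search(text)     # scans for the first match anywhere
pairs = pattern.findall(text)    # list of tuples, one per match, one item per group
# finditer yields match objects, so groups and spans are both available
spans = [(m.group(1), m.group(2), m.span()) for m in pattern.finditer(text)]

print(at_start)
print(first.group(), first.groups())
print(pairs)
print(spans)
```

Because the pattern has two groups, findall() returns tuples instead of whole matched substrings; group() with no argument still returns the full match.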