DOC PREVIEW
Princeton COS 333 - Advanced Programming Techniques

This preview shows page 1-2-3-4 out of 13 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 13 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

1COS 333: Advanced Programming Techniques• Brian Kernighan– bwk@cs, www.cs.princeton.edu/~bwk– 311 CS Building– 609-258-2089 (but email is always better)• TA's:– Chris DeCoro, cdecoro@cs, CS 103B, 258-0944 – Aquinas Hobor, aahobor@cs, CS 214, 258-1793 • Today– course overview– administrative stuff– regular expressions and grep• Check out the course web page (CS, not Blackboard!)– notes, readings and assignments will be posted there– Assignment 1 is posted– project information is posted• Do the survey if you haven't alreadyThemes• languages– C, Java, AWK, Perl, C++, Visual Basic, C#, ...– programmable tools, application-specific languages• tools – where did they come from and whyhow they have evolved, mutated, decayed– how to use them –how they work– how to build your own• programming– design, interfaces, patterns– reuse, theft, prototyping, components– programs that write programs– portability, standards, style– debugging, testing– performance assessment and improvement– tricks of the trade– tradeoffs, compromises, engineering• history and culture of programming2(Very) Tentative OutlineFeb 1 regular expressions; grepFeb 8 scripting languages: Awk & PerlFeb 15 more scripting: Perl, PHP(?), CGI Feb 22 Java; object-oriented programmingMar 1 networking; MySQL; projectMar 8 user interfaces, SwingMar 15 (spring break)Mar 22 C++Mar 29 C++, Standard Template LibraryApr 5 Visual Basic; COM, componentsApr 12 XML, web services; .net, C# Apr 19 Tcl/Tk?, language tools? Apr 26 ?May 4-5 project presentationsSome Mechanics• prerequisites– C, Unix (COS 217)• 5 programming assignments in first half– posted on course web page– deadlines matter• group project in second half– groups of 3-4; start identifying potential teammates– details in a few weeks – deadlines matter• monitor the web page– readings for most weeks– notes generally posted ahead of time• class attendance and participation <=> no midterm or final– sporadic unannounced short quizzes are possible3Regular expressions and grep• regular expressions–notation–mechanization– pervasive in Unix tools– not in most general-purpose languagesthough common in scripting languages and (some) editors– basic implementation is remarkably simple– efficient implementation requires theory and practice• grep is the prototypical tool– people used to write programs for searching (or did it by hand)–tools became important– tools are not as much in fashion todayGrep regular expressionsc any character matches itself, except for metacharacters. [ ] ^ $ * + \r1r2matches r1followed by r2. matches any single character[...] matches one of the characters in set ...a set like a-z or 0-9 includes any character in the range[^...] matches one of the characters not in seta set like a-z or 0-9 includes any char in the range^ matches beginning of line when ^ beginspatternno special meaning elsewhere in pattern$ matches end of line when $ ends patternno special meaning elsewhere in pattern* any regular expression followed by * matches zero or more instances\c matches c unless c is ( ) or digit\(...\) tagged regular expression that matches ...the matched strings are available as \1, \2, etc.4Examples of matchingthingthinganywhere in string ^thingthingat beginning of stringthing$thingat end of string^thing$ string that contains only thing^$ empty string. non-empty, i.e., at least 1 char^ matches any string, even emptything.$thingplus any char at end of string thing\.$thing. at end of string\\thing\\ \thing\ anywhere in string[tT]hingthingor Thinganywhere in stringthing[0-9]thingfollowed by one digit thing[^0-9]thingfollowed by a non-digit thing[0-9][^0-9]thingfollowed by digit, then non-digit thing1.*thing2thing1then any text then thing2^thing1.*thing2$thing1at beginning andthing2at end egrep: fancier regular expressionsr+ one or more occurrences of rr? zero or one occurrences of rr1|r2r1or r2(r) r (grouping)([0-9]+\.?[0-9]*|\.[0-9]+)([Ee][-+]?[0-9]+)?5Grammar for egrep regular exprsr:c. ^ $ [ccc] [^ccc]r* r+ r? r1 r2r1|r2(r)Precedence:* + ? are higher than concatenationwhich is higher than |([0-9]+\.?[0-9]*|\.[0-9]+)([Ee][-+]?[0-9]+)?The grep family• grep–basic matching• egrep– fancier regular expressions– trades compile time and space for run time• fgrep– parallel search for many fixed strings• agrep– "approximate" grep: search with errors permitted• relatives that use similar regular expressions– ed original unix editor– sed stream editor– vi, emacs, sam, ... editors– lex lexical analyzer generator – awk, perl, tcl, python, … scripting languages– Java, C# ... libraries in mainstream languages• simpler variants– filename "wild cards" in Unix and other shells– "LIKE" operator in Visual Basic, SQL, etc.6Basic grep algorithmwhile (get a line)if match(regexpr, line)print line• (perhaps) compile regexpr into an internal representation suitable for efficient matching• match() slides the regexpr along the input line,looking for a match at each pointregexprlineGrep (TPOP, p 226)/* grep: search for regexp in file */int grep(char *regexp, FILE *f, char *name){int n, nmatch;char buf[BUFSIZ];nmatch = 0;while (fgets(buf, sizeof buf, f) != NULL) {n = strlen(buf);if (n > 0 && buf[n-1] == '\n')buf[n-1] = '\0';if (match(regexp, buf)) {nmatch++;if (name != NULL)printf("%s:", name);printf("%s\n", buf);}}return nmatch;}7Match anywhere on a line• look for match at each position of text in turn/* match: search for regexp anywhere in text */int match(char *regexp, char *text){if (regexp[0] == '^')return matchhere(regexp+1, text);do { /* must look even if string is empty */if (matchhere(regexp, text))return 1;} while (*text++ != '\0');return 0;}Match starting at current position/* matchhere: search for regexp at beginning of text */int matchhere(char *regexp, char *text){if (regexp[0] == '\0')return 1;if (regexp[1] == '*')return matchstar(regexp[0], regexp+2, text);if (regexp[0] == '$' && regexp[1] == '\0')return *text == '\0';if (*text!='\0' && (regexp[0]=='.' || regexp[0]==*text))return matchhere(regexp+1, text+1);return 0;}• follow the easy case first: no metacharacters• note that this is recursive– maximum depth: one level for each regexpr character that matches8Matching * (repetitions)• matchstar() called to match c*... • matches if rest of regexpr matches rest of input– null matches


View Full Document

Princeton COS 333 - Advanced Programming Techniques

Download Advanced Programming Techniques
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Advanced Programming Techniques and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Advanced Programming Techniques 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?