UW CSE 303 - Lecture Notes

CSE 303: Concepts and Tools for Software Development
Dan Grossman
Winter 2006
Lecture 5: Regular Expressions (and more), grep, other utilities

Dan Grossman, CSE303 Winter 2006, Lecture 5


Where are We

• We are done learning this bizarre pseudo-programming language called the shell.
• Today: Specifying string patterns for many utilities, particularly grep and sed.
• Monday: Homework 1 due, no class
• Wednesday: sed
  – needed in one place for homework 2
  – could do that one part manually for now (?)
• Friday: We start learning C.

Note: Start homework 2 early.


Globbing vs. Regular Expressions vs. ...

“Globbing” refers to filename expansion characters.

“Regular expressions” are a different but overlapping set of rules for specifying patterns to programs like grep. (Sometimes called “pattern matching”.)

More distinctions:
• Regular expressions a la CSE322
• “Regular expressions” in grep
• “Regular expressions” in egrep (same as grep -E)
• More subtle distinctions per program...


Real Regular Expressions

Some of the crispest, most elegant, most useful CS theory out there.
What computer scientists know and ill-educated hackers don’t (to their detriment).

A regular expression p may “match” a string s. If p =
• a, b, ...: p matches s if s is that single character.
• p1p2: p matches s if we can write s as s1s2, where p1 matches s1 and p2 matches s2.
• p1|p2: p matches s if p1 matches s or p2 matches s (this is the egrep syntax; for grep use \|).
• p1*: p matches s if there is an i ≥ 0 such that p1 ... p1 (i copies of p1) matches s. (For i = 0, it matches the zero-character string.)

Lots of examples with egrep.


Why this language?

Amazing facts (see 322):
• Exactly the patterns that can be found by a program that can say before it sees its input how much space it needs. (Decide if a 1GB string has a substring that matches...)
• You can write a program that takes two regular expressions and decides if one matches every string the other does.
• ... see CSE322


Conveniences

Lots of “conveniences” do not make the language any more powerful:
• p1+ is just p1p1*
• p1? is just (|p1), i.e., the empty string or p1
• [zd-h] is just z|d|e|f|g|h
• [^A-Z] and . expand to long alternations, but are technically just conveniences.
• p1{n} is just p1 ... p1 (n copies)
• p1{n,} is just p1 ... p1 (n copies) followed by p1*
• p1{n,m} is just p1 ... p1 (n copies) followed by p1? ... p1? (m − n copies)


Beginning and end

Really grep is matching each line against .*p.*.

You can say that is not what you want with ^ (beginning) and $ (end) or both (match whole line exactly).

I can’t think of a good reason to put these characters in the middle of a pattern, but you can.

Fundamentally, we are still in the realm of “real” regular expressions.


Nasty gotchas

• Special characters for one program are not special for another.
• For example, I found \{ for grep but { for egrep.
• Must quote your patterns so the shell does not muck with them, and use single quotes if they contain $.
• Must escape special characters with \ if you need them literally: \. and . are very different.
  – But inside [] less quoting (so backslash becomes literal)!


Previous matches

• Up to 9 times in a pattern, you can group with (p) and refer to the matched text later! (Need backslashes in sed.)
• You can refer to the text (most recently) matched by the nth one with \n.
• Simple example: double-words ^\([a-zA-Z]*\)\1$
• You cannot do this with real regular expressions; the program must keep the previous strings.
  – Especially useful with sed because of substitutions.


Other Utilities

Some very useful programs you can learn on your own:
find (search for files, e.g., find /usr -name words)
diff (compare two files’ contents; output is easy for humans and programs to read (see also patch))
wc (word-count (also characters and lines))

Also:
For many programs the -r flag makes them recursive (apply to all files, subdirectories, subsubdirectories, ...).
Examples: chmod (which spells the flag -R), cp, diff, rm.
So “delete everything on the computer” is cd /; rm -rf * (be careful!)
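The regular-expression slides above promise "lots of examples with egrep" but the examples themselves were done live. The following is a minimal sketch of what such a session might look like, not the examples actually used in lecture. The helper name matching_lines and all of the sample words are made up for illustration; only grep, grep -E (the same as egrep), and printf are real commands. Every pattern is single-quoted so the shell leaves it alone.

```shell
#!/bin/sh
# Force the C locale so that a range like d-h means exactly d,e,f,g,h.
LC_ALL=C
export LC_ALL

# Hypothetical helper (not from the lecture): print each WORD that
# matches the extended regular expression PATTERN, one per line.
# usage: matching_lines PATTERN WORD...
matching_lines() {
    pattern=$1
    shift
    # printf writes one word per line; grep -E is the same as egrep.
    printf '%s\n' "$@" | grep -E -- "$pattern"
}

# Star: x, then zero or more copies of ab, then y.
matching_lines '^x(ab)*y$' xy xaby xababy xay    # xy, xaby, xababy

# Alternation, anchored so the whole line must be cat or dog.
matching_lines '^(cat|dog)$' cat dog catdog      # cat, dog

# Conveniences: a bounded repeat and a character class with a range.
matching_lines '^a{2,3}$' a aa aaa aaaa          # aa, aaa
matching_lines '^[zd-h]+$' zed dig hedge         # zed, hedge

# Without anchors the pattern may match anywhere in the line
# (grep really matches .*p.*); with ^ it must start the line.
matching_lines 'ab' cab abc xyz                  # cab, abc
matching_lines '^ab' cab abc xyz                 # abc

# Previous matches: the double-words pattern from the slide, in plain
# grep syntax where grouping needs backslashes.
printf '%s\n' abab abc hellohello | grep '^\([a-zA-Z]*\)\1$'   # abab, hellohello
```

The last command uses plain grep instead of grep -E because the slide's pattern is written in grep's backslash-parenthesis syntax; whether \1 also works under egrep depends on the implementation, which is one more instance of the per-program differences listed under "Nasty gotchas".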

