FSU COP 4342 - Regular expressions and case insensitivity

This preview shows page 1-2-3-27-28-29 out of 29 pages.

COP 4342 Fall 2006 Perl 06

Regular expressions and case insensitivity

As previously mentioned, you can make matching case insensitive with the i flag:

    /\b[Uu][Nn][Ii][Xx]\b/;   # explicitly giving case folding
    /\bunix\b/i;              # using the "i" flag to fold case

Really matching any character with "."

As mentioned before, usually the "." (dot, period, full stop) matches any character except newline. You make it match newline with the s flag:

    /"(.|\n)*"/;   # match any quoted string, even with newlines embedded
    /"(.*)"/s;     # same meaning, using the "s" flag

N.B.: I like to use the flags ///six; as a personal default set of flags with Perl regular expressions.

Going global with the "g" flag

You can make your matching global with the g flag. For ordinary matches in scalar context, this means making them stateful: each time the match is invoked again, Perl resumes from where the previous match on that variable left off. Changing the value of the variable resets the match position.

    #!/usr/bin/perl -w
    # 2006 09 29 - rdl Script36.pl
    # shows the //g as stateful...
    while(<>)
    {
      while(/[A-Z]{2,}/g)
      {
        print "$&\n" if (defined($&));
      }
    }

Interpolating variables in patterns

You can even specify a variable inside of a pattern, but you want to make sure that it gives a legitimate regular expression.

    my $var1 = "[A-Z]*";
    if( "AB" =~ /$var1/ )
    {
      print "$&";
    }
    else
    {
      print "nopers";
    }
    # yields
    AB

Regular expressions and substitution

+ The s/.../.../ form can be used to make substitutions in the specified string.
+ If paired delimiters are used, then you have to use two pairs of the delimiters.
+ g after the last delimiter indicates to replace more than just the first occurrence.
+ The substitution can be bound to a string using =~. Otherwise it makes the substitutions in $_.
+ The operation returns the number of replacements performed, which can be more than one with the "g" option.

Examples

    #!/usr/bin/perl -w
    # 2006 09 29 - rdl Script37.pl
    # shows s///g... by removing acronyms
    use strict;
    while(<>)
    {
      s/([A-Z]{2,})//g;
      print;
    }

Examples

    s/\bfigure (\d+)/Figure $1/    # capitalize references to figures
    s{//(.*)}{/\*$1\*/}            # use old style C comments
    s!\bif\(!if (!                 # put a blank between if and ( (the ( must be escaped in the pattern)
    s(!)(.)                        # tone down that message
    s[!][.]g                       # replace all occurrences of '!' with '.'

Case shifting

You can use \U and \L to change the text that follows them to upper and lower case:

    $text = " the acm and the ieee are the best! ";
    $text =~ s/acm|ieee/\U$&/g;
    print "$text\n";
    # yields
     the ACM and the IEEE are the best!

    $text = "CDA 1001 and COP 3101 are good classes, but COP 4342 is better!";
    $text =~ s/\b(COP|CDA) \d+/\L$&/g;
    print "$text\n";
    # yields
    cda 1001 and cop 3101 are good classes, but cop 4342 is better!

Using tr/// (also known as y///)

+ In Perl you can also convert one set of characters to another using the tr/.../.../ form. (Or if you like, you can use y///.)
+ Much like the program tr, you specify two lists of characters: the first lists the characters to be replaced, and the second lists their replacements.
+ tr returns the number of items substituted (or deleted).
+ The modifier d deletes matched characters that have no replacement in the second list.
+ The modifier s "squashes" each run of the same translated character into a single character.

Examples (from the perlop man page)

    $ARGV[1] =~ tr/A-Z/a-z/;    # canonicalize to lower case
    $cnt = tr/*/*/;             # count the stars in $_
    $cnt = $sky =~ tr/*/*/;     # count the stars in $sky
    $cnt = tr/0-9//;            # count the digits in $_

More examples

    # get rid of redundant blanks in $_
    tr/ //s;
    # replace [ and { with ( in $text
    $text =~ tr/[{/(/;

Using split

The split function breaks up a string according to a specified separator pattern and generates a list of the substrings.

For example:

    $line = " This sentence contains five words. ";
    @fields = split / /, $line;
    # $count starts out undefined, so the first line shows no number;
    # the first field is empty because $line begins with a blank
    map { print "$count --> $fields[$count]\n"; $count++; } @fields;
    # yields
     -->
    1 --> This
    2 --> sentence
    3 --> contains
    4 --> five
    5 --> words.

Using the join function

The join function does the reverse of the split function: it takes a list and converts it to a string. However, it differs in that its first argument is not a pattern, just a string:

    @fields = qw/ apples pears cantaloupes cherries /;
    $line = join "<-->", @fields;
    print "$line\n";
    # yields
    apples<-->pears<-->cantaloupes<-->cherries

Filehandles

[Also see man perlfaq5 for more detail on this subject.]

A filehandle is an I/O connection between your process and some device or file. Perl output is buffered.

Perl has three predefined filehandles: STDIN, STDOUT, and STDERR.

Unlike other variables, you don't declare filehandles. The convention is to use all uppercase letters for filehandle names. (Especially important if you deal with anonymous filehandles!)

The open operator takes two arguments, a filehandle name and a connection (e.g. filename). The connection can start with "<", ">", or ">>" to indicate read, write, and append access.

Examples

    open IN, "in.dat";        # open in.dat for input
    open IN2, "<$file";       # open filename in $file for input
    open OUT, ">out.dat";     # open out.dat for output
    open LOG, ">>log.txt";    # open log.txt to append output

Closing filehandles

The close operator closes a filehandle. This causes any remaining output data associated with this filehandle to be flushed to the file.

Perl automatically closes a filehandle at the end of a process, or if you reopen it.

Examples

    close IN;     # closes the IN filehandle
    close OUT;    # closes the OUT filehandle
    close LOG;    # closes the LOG filehandle

Testing open

You can check the status of opening a file by examining the result of the open operation. It returns a true value if it succeeded, and a false one if it failed.

    if (!open OUT, ">out.dat") {
      die "Could not open out.dat.";
    }

Using a filehandle

    open IN, "<in.dat";
    open OUT, ">out.dat";
    $i = 1;
    while ($line = <IN>) {
      printf OUT "%d: $line", $i++;   # number each line; $i is incremented after use
    }

Note that a comma is not used after the filehandle in a print or printf statement.

Reopening a
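
Looking back at the filehandle slides above (open, testing open, printing to a filehandle, close), the steps can be put together in one small script. This is a minimal sketch, not the course's own code: the helper name number_lines and the sample text are made up for illustration, and it uses the three-argument form of open with lexical filehandles, a more recent idiom than the two-argument bareword form shown in the slides. File::Temp is a core module, used here only so the sketch does not touch real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# number_lines($in_path, $out_path): hypothetical helper (not from the
# slides) that copies a file, prefixing each line with its line number.
# Returns the number of lines copied.
sub number_lines {
    my ($in_path, $out_path) = @_;

    # open returns false on failure; $! holds the reason.
    open(my $in,  '<', $in_path)  or die "Could not open $in_path: $!";
    open(my $out, '>', $out_path) or die "Could not open $out_path: $!";

    my $i = 0;
    while (my $line = <$in>) {
        $i++;
        # No comma after the filehandle. $line is passed as an argument
        # so that a '%' in the data is not read as a format directive.
        printf {$out} "%d: %s", $i, $line;
    }

    # close flushes any buffered output; check it on the output side.
    close($in);
    close($out) or die "Could not close $out_path: $!";
    return $i;
}

# Demo on temporary files (sample text is an assumption for illustration).
my ($src_fh, $src) = tempfile(UNLINK => 1);
print {$src_fh} "UNIX is 100% fun\n", "the acm and the ieee\n";
close($src_fh) or die "Could not close $src: $!";

my (undef, $dst) = tempfile(UNLINK => 1);
my $count = number_lines($src, $dst);

# Read the result back and show it.
open(my $check, '<', $dst) or die "Could not open $dst: $!";
my @numbered = <$check>;
close($check);

print @numbered;
```

Passing the line as a printf argument rather than interpolating it into the format string is a deliberate choice here: input such as "100% fun" would otherwise be parsed as a format directive.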

