DREXEL CS 265 - Regular Expressions in Perl


Regular Expressions in Perl – Part I
William Fisher

Regular Expressions

A regular expression is a string that represents a pattern. It can be used to search strings, extract desired parts of a string, or perform a search-and-replace operation on a string.

A basic regular expression match uses the =~ operator as follows:
– "Hello World" =~ /World/; # matches

The expression returns true if the string contains the pattern and false if it does not. Replacing the '=' with a '!' gives the !~ operator, which reverses the result:
– "Hello World" !~ /World/; # false, because the string contains 'World'

A variable can also be used in a regular expression:
– $greeting = "World";
– "Hello World" =~ /$greeting/; # matches

When matching against the default variable $_, the "$_ =~" part can be omitted:
– $_ = "Hello World";
– if (/World/) # tests $_ against /World/

The default delimiter / can be replaced by putting an m in front of the expression and choosing a different delimiter. The slash can then be used as an ordinary character:
– "Hello World" =~ m!World!; # matches

Regular expressions match a string literally, so they are case sensitive and treat a space as a character like any other:
– "Hello World" =~ /world/; # doesn't match
– "Hello World" =~ /oW/; # doesn't match
– "Hello World" =~ /World /; # doesn't match

A regular expression always matches at the earliest possible point in the string:
– "That hat is red" =~ /hat/; # matches 'hat' in 'That'

Metacharacters

Metacharacters can be used to make more complicated matches. The metacharacters are: {}[]()^$.|*+?\

A metacharacter is treated as an ordinary character if it is preceded by a backslash:
– "The interval is [0,1)." =~ /\[0,1\)\./ # matches
– "/usr/bin/perl" =~ /\/usr\/bin\/perl/; # matches

Escape Sequences

Escape sequences represent ASCII characters that have no printable equivalent, such as \n, \t, \r, and \a. They can be included in regular expressions just like any other character:
– "1000\n2000" =~ /0\n20/ # matches

A backslash followed by three digits denotes a character by its octal code. A backslash followed by a lower-case x and two hexadecimal digits (0-9, A-F) denotes a character by its hexadecimal code:
– "cat" =~ /\143\x61\x74/ # matches

Variables in Regular Expressions

Variables are interpolated into regular expressions in the same way as into double-quoted strings in Perl:
– $foo = 'house';
– 'housecat' =~ /$foo/; # matches
– 'cathouse' =~ /cat$foo/; # matches
– 'housecat' =~ /${foo}cat/; # matches

Anchor Metacharacters

The ^ and $ metacharacters require the expression to match at the beginning and at the end of a string, respectively. The $ also matches just before a \n at the end of the string:
– "housekeeper" =~ /^keeper/; # doesn't match
– "housekeeper" =~ /keeper$/; # matches
– "housekeeper\n" =~ /keeper$/; # matches

When both anchors are used, the pattern must match the entire string:
– "keeper" =~ /^keep$/; # doesn't match
– "keeper" =~ /^keeper$/; # matches
– "" =~ /^$/; # ^$ matches an empty string

Character Classes

A character class matches any one of a set of possible characters, which are listed within brackets [...]:
– /[bcr]at/; # matches 'bat', 'cat', or 'rat'
– /item[0123456789]/; # matches 'item0' or ... or 'item9'
– /[yY][eE][sS]/; # matches 'yes' in a case-insensitive way

Another way to get case-insensitive matching is the //i modifier:
– /yes/i

Special Characters and Range Operators in Character Classes

Variables are interpolated inside a character class as they are elsewhere, and a backslash turns a special character into an ordinary one:
– $x = 'bcr';
– /[$x]at/; # matches 'bat', 'cat', or 'rat'
– /[\$x]at/; # matches '$at' or 'xat'

The range operator '-' represents a contiguous set of characters, such as [0-9] or [a-z]:
– /item[0-9]/; # matches 'item0' or ... or 'item9'

The '-' is treated as an ordinary character if it is the first or last character in the class.

Negation in a Character Class

A ^ at the beginning of a class means that the character can be anything except what is listed in the class:
– /[^a]at/; # doesn't match 'aat' or 'at', but matches all other 'bat', 'cat', '0at', '%at', etc.
– /[^0-9]/; # matches a non-numeric character
– /[a^]at/; # matches 'aat' or '^at'; here '^' is ordinary

Common Character Classes

Certain common character classes have abbreviations:
– \d represents [0-9]
– \s represents [\ \t\r\n\f] (a whitespace character)
– \w represents [0-9a-zA-Z_] (a word character)
– . represents any character except \n

\D, \S, and \W represent the negations of their lower-case equivalents. These abbreviations can be used inside or outside of character classes. A period must be escaped or placed in a character class to be matched as an ordinary character.

Word Anchor

The \b anchor matches a boundary between a word character and a non-word character, that is, between \w\W or \W\w:
– $x = "Housecat catenates house and cat";
– $x =~ /cat/; # matches cat in 'housecat'
– $x =~ /\bcat/; # matches cat in 'catenates'
– $x =~ /cat\b/; # matches cat in 'housecat'
– $x =~ /\bcat\b/; # matches 'cat' at end of string

//s and //m Modifiers

The //s modifier treats the string as a single line, so the '.' character class also matches \n:
– $x = "There once was a girl\nWho programmed in Perl\n";
– $x =~ /girl.Who/s; # matches, "." matches "\n"

The //m modifier makes the anchor metacharacters treat each line as a separate string, so that ^ and $ can match at the beginning or end of any line:
– $x =~ /^Who/m; # matches, "Who" at start of second line

The two modifiers can be combined (//sm) to get both effects. Under //m, \A and \Z can still be used to match the beginning and the end of the whole string; \Z matches at the end of the string or just before a final \n, while \z matches only at the very end of the string.

Alternation Metacharacter

The | metacharacter can be used to match any one of several possible strings:
– "cats and dogs" =~ /cat|dog|bird/; # matches "cat"

The position in the string still takes priority over the order of the alternatives:
– "cats and dogs" =~ /dog|cat|bird/; # matches "cat"

When more than one alternative can match at the same position, the first one listed is used:
– "cats" =~ /c|ca|cat|cats/; # matches "c"
– "cats" =~ /cats|cat|ca|c/; # matches "cats"

Source

Kvale, Mark. Perl regular
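
The matching rules from the slides above (leftmost match, order of alternatives, word boundaries, and the //s and //m modifiers) can be checked with a short script. This is only a sketch: the helpers first_match and report are hypothetical names introduced for illustration and do not come from the slides. It reads the match offsets from Perl's built-in @- and @+ arrays.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the slides): returns the matched text and
# its starting offset, or an empty list when the pattern does not match.
sub first_match {
    my ($string, $regex) = @_;
    return unless $string =~ $regex;
    return (substr($string, $-[0], $+[0] - $-[0]), $-[0]);
}

# Hypothetical reporting wrapper around first_match.
sub report {
    my ($label, $string, $regex) = @_;
    my ($text, $pos) = first_match($string, $regex);
    printf "%-22s %s\n", $label,
        defined $text ? "matched '$text' at offset $pos" : "no match";
}

# The leftmost match wins, whatever the order of the alternatives.
report('hat',          "That hat is red", qr/hat/);           # 'hat', offset 1
report('dog|cat|bird', "cats and dogs",   qr/dog|cat|bird/);  # 'cat', offset 0

# At a single position, the first alternative that matches is used.
report('c|ca|cat|cats', "cats", qr/c|ca|cat|cats/);           # 'c'
report('cats|cat|ca|c', "cats", qr/cats|cat|ca|c/);           # 'cats'

# Word boundaries select different occurrences of 'cat'.
my $x = "Housecat catenates house and cat";
report('\bcat',   $x, qr/\bcat/);    # offset 9, in 'catenates'
report('cat\b',   $x, qr/cat\b/);    # offset 5, in 'Housecat'
report('\bcat\b', $x, qr/\bcat\b/);  # offset 29, the final word

# //s and //m change how '.', '^', and the end anchors treat newlines.
my $poem = "There once was a girl\nWho programmed in Perl\n";
for my $case (
    [ 'girl.Who',    qr/girl.Who/  ],  # no match: '.' excludes "\n"
    [ 'girl.Who /s', qr/girl.Who/s ],  # match: '.' includes "\n"
    [ '^Who',        qr/^Who/      ],  # no match: '^' is start of string only
    [ '^Who /m',     qr/^Who/m     ],  # match: '^' is start of any line
    [ 'Perl\Z',      qr/Perl\Z/    ],  # match: \Z allows a final "\n"
    [ 'Perl\z',      qr/Perl\z/    ],  # no match: \z is the very end
) {
    my ($label, $regex) = @$case;
    printf "%-22s %s\n", $label, $poem =~ $regex ? "match" : "no match";
}
```

Passing patterns as qr// objects keeps each modifier attached to its pattern, so the same helper can be used with and without //s or //m.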

