By Andrew Cory

Grouping Things and Hierarchical Matching

Grouping characters: ( and )
Allow parts of a regular expression to be treated as a single unit.
Useful for building several words or phrases that share the same base characters or words.
    Ex: /house(cat|keeper)/ matches "housecat" or "housekeeper"
    Ex: /(a|b|c)d/ matches "ad", "bd", or "cd"
    Ex: /(19|20|)\d\d/ matches 19xx, 20xx, or xx

(Continued)

Backtracking: the step-by-step process of trying alternatives, checking whether each one matches, and moving on to the next alternative if it doesn't.
Any given regular expression has several paths, each of which can match a different string.
Backtracking is a trial-and-error method that works through the string one character at a time.

(Continued)

Backtracking example: "abcd" =~ /(af|ab)(ce|c|cd)/
    1. Start with the letter a.
    2. Try the 1st alternative (af).
    3. a matches, but f doesn't match b; backtrack to a and try the 2nd alternative (ab).
    4. a and b match the first 2 characters; the first group is satisfied, so move on to the next group.
    5. c matches, but e doesn't match d; backtrack to c and try the 2nd alternative (c).
    6. c matches; the second group is satisfied, therefore the whole expression is satisfied by "abcd" (the matched text is "abc").
Note: the 3rd alternative in the 2nd group (cd) matches too, but it is irrelevant; the string has already satisfied the regular expression.

Extracting Matches

Parentheses not only group; they also extract and separate the parts of a string that match the given condition.
    I.e.:
    if ($time =~ /(\d\d):(\d\d):(\d\d)/) {
        $hours = $1;
        $minutes = $2;
        $seconds = $3;
    }
    ($hours, $minutes, $second) = ($time =~ /(\d\d):(\d\d):(\d\d)/);

(Continued)

Nested grouping in a regular expression results in more separation. Groups are numbered by their opening parenthesis, from left to right.
    Ex: /(ab(cd|ef)((gi)|j))/
        $1: ab(cd|ef)((gi)|j)
        $2: cd|ef
        $3: (gi)|j
        $4: gi
Backreferences (\1, \2, etc.) are related to the matching variables ($1, $2, etc.) but can only be used inside the regular expression.
Useful for repeating phrases.
    Ex: /(\w\w\w)\1/ matches "booboo" or "murmur"

(Continued)

The positions of the string portions that match the conditions are also stored, in the @- and @+ arrays.
    Ex:
    $x = "Mmm...donut";
    $x =~ /(Mmm)\.\.\.(donut)/;
    foreach $expr (1..$#-) {
        print "$expr: ${$expr} at ($-[$expr],$+[$expr])\n";
    }
    Output:
    1: Mmm at (0,3)
    2: donut at (6,11)

(Continued)

Even when a regexp has no groupings, the pieces of the searched string are still stored in separate variables:
    $` is the string before the match
    $& is the string that matched
    $' is the string after the match
    Ex:
    $x = "I like chips";
    $x =~ /like/;
    $` = "I ", $& = "like", $' = " chips"

Matching Repetitions

The quantifier characters ?, *, +, and {} are used to match words or syllables of any length without massive amounts of repetition.
Definitions:
    a?      matches a one or zero times
    a*      matches a any number of times (including zero)
    a+      matches a one or more times (at least once)
    a{n,m}  matches a at least n times, but not more than m times
    a{n,}   matches a at least n times
    a{n}    matches a exactly n times

(Continued)

Examples:
    /[a-z]+\s+\d*/  a lowercase word, some space, and any number of digits: "ajc 93", "jgro 843986"
    /(\w+)\s+\1/    a doubled word of any length with a space in between: "jon jon", "hidalgo hidalgo"
    /y(es)?/i       "y", "Y", or "yes"

(Continued)

Perl will always try to match as much of a given string as possible, so long as the regular expression as a whole still matches.
I.e., with the ? operator, Perl first tries to match with the preceding item present; if that fails, it tries again without it.
    Ex:
    $x = "the cat in the hat";
    $x =~ /^(.*)(at)(.*)$/;
    $1 = "the cat in the h", $2 = "at", $3 = "" (empty)

(Continued)

Quantifiers that grab as much of the string as possible are known as maximal-match or greedy quantifiers.
4 important regular expression principles:
    Principle 1: Any regexp will be matched at the earliest possible position in the string.
    Principle 2: In an alternation a|b|c, the leftmost alternative that matches will be the one used.
    Principle 3: Greedy quantifiers will match as much of the string as possible while still allowing the whole regexp to match.
    Principle 4: The leftmost greedy quantifier has priority over the other greedy quantifiers in the regexp.

(Continued)

Examples:
    $x = "The programming republic of Perl";
    $x =~ /^(.+)(e|r)(.*)$/;
    $1 = "The programming republic of Pe", $2 = "r", $3 = "l"
    $x =~ /(m{1,2})(.*)$/;
    $1 = "mm", $2 = "ing republic of Perl"

(Continued)

Sometimes returning the minimal piece of a string is essential; thus the minimal-match or non-greedy quantifiers ??, *?, +?, and {}? were created.
Definitions:
    a??      match a 0 or 1 times; try 0 first, then 1
    a*?      match a any number of times, but as few as possible
    a+?      match a 1 or more times, but as few as possible
    a{n,m}?  match a at least n times, no more than m times, as few as possible
    a{n,}?   match a at least n times, as few as possible
    a{n}?    match a exactly n times; the same thing as a{n}

(Continued)

Examples (same string as above, different operators):
    $x = "The programming republic of Perl";
    $x =~ /^(.+?)(e|r)(.*)$/;
    $1 = "Th", $2 = "e", $3 = " programming republic of Perl"
    $x =~ /(m{1,2}?)(.*?)$/;
    $1 = "m", $2 = "ming republic of Perl"

(Continued)

Note: Principle 3 can be restated for non-greedy quantifiers: the leftmost non-greedy quantifier matches as little of the string as possible while still allowing the whole regexp to match.

(Continued)

Quantifiers are susceptible to backtracking.
    Ex:
    $x = "the cat in the hat";
    $x =~ /^(.*)(at)(.*)$/;
    $1 = "the cat in the h", $2 = "at", $3 = ""
    1. Start with the first letter, t.
    2. The first quantifier (.*) starts and matches the whole string.
    3. a does not match the end of the string; backtrack once.
    4. a does not match the last letter, t; backtrack once more.
    5. Match a, then t.
    6. Move on to the 3rd element (.*). We are already at the end of the string, so it is assigned the empty string.

(Continued)

Error alert! Nested indeterminate quantifiers are dangerous things.
    Ex: /(a|b+)*/
In the above example, the inner b+ matches a run of b's of any length, and the outer * then repeats the whole group any number of times, so the same run of b's can be split up in many different ways.
If a match is not found early in the process, Perl will attempt EVERY possibility before halting, which can use a massive amount of time and memory.

Building a Regexp

Step one: decide what we want to match and what we want to exclude.
    Ex: a regexp that matches numbers will reject any string that is not a number and accept both integers and floating-point numbers.
Step two: break the problem down into smaller parts. Smaller parts are easier to work with.
    Ex: any integer: /[+-]?\d+/
        \d represents a digit
        [+-]? represents the number's sign (positive or negative)

(Continued)

Ex: floating point. Has a sign, decimal point, fractional part, and an exponent, e.g. -25.4E-72
    /[+-]?(\d+\.\d+|\d+\.|\.\d+|\d+)([eE][+-]?\d+)?/
    1st part, [+-]?, is the sign of the number.
    2nd part, (\d+\.\d+|\d+\.|\.\d+|\d+), covers the several different ways a floating-point number can be written: 2.54, 346., .395, 500
    3rd part, ([eE][+-]?\d+)?, is the exponential part, which is represented by e or E followed by an optional sign and then digits of any length: e-5, E9000

(Continued)

The x modifier in Perl allows one to write complex regexps with as much spacing as the programmer wants:
    /
        [+-]?
        ( \d+\.\d+ | \d+\. | \.\d+ | \d+ )
        ( [eE] [+-]? \d+ )?
    /x

(Continued)

The downside to the x modifier: certain symbols must be typed differently …
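The number regexp built up in the steps above can be tried out with a short script. This is a minimal sketch rather than the slides' own code: the ^ and $ anchors are an assumption added so that the whole string must be a number, the helper name is_number is hypothetical, and the test strings are chosen only for illustration.

```perl
use strict;
use warnings;

# The three parts from the slides, written with the x modifier.
# Assumption: ^ and $ are added so the entire string has to match.
my $number = qr/
    ^
    [+-]?                                # 1st part: optional sign
    ( \d+\.\d+ | \d+\. | \.\d+ | \d+ )   # 2nd part: 2.54, 346., .395, 500
    ( [eE] [+-]? \d+ )?                  # 3rd part: optional exponent
    $
/x;

# Hypothetical helper: returns 1 if $text looks like a number, else 0.
sub is_number {
    my ($text) = @_;
    return $text =~ $number ? 1 : 0;
}

for my $candidate ('500', '2.54', '346.', '.395', '-25.4E-72', 'abc', '1e', '.') {
    printf "%-10s %s\n", $candidate,
        is_number($candidate) ? 'number' : 'not a number';
}
```

With these inputs, the first five strings are reported as numbers and the last three are rejected: "abc" has no digits, "1e" has an exponent marker with no digits after it, and "." has no digits on either side of the decimal point.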