GT AE 6382 - Regular Expression HOWTO

Regular Expression HOWTO
Release 0.05

A.M. Kuchling
April 20, [email protected]

This document is an introductory tutorial to using regular expressions in Python with the re module. It provides a gentler introduction than the corresponding section in the Library Reference.

This document is available from http://www.amk.ca/python/howto.

Contents

1 Introduction
2 Simple Patterns
    2.1 Matching Characters
    2.2 Repeating Things
3 Using Regular Expressions
    3.1 Compiling Regular Expressions
    3.2 The Backslash Plague
    3.3 Performing Matches
    3.4 Module-Level Functions
    3.5 Compilation Flags
4 More Pattern Power
    4.1 More Metacharacters
    4.2 Grouping
    4.3 Non-capturing and Named Groups
    4.4 Lookahead Assertions
5 Modifying Strings
    5.1 Splitting Strings
    5.2 Search and Replace
6 Common Problems
    6.1 Use String Methods
    6.2 match() versus search()
    6.3 Greedy versus Non-Greedy
    6.4 Not Using re.VERBOSE
7 Feedback

1 Introduction

The re module was added in Python 1.5, and provides Perl-style regular expression patterns. Earlier versions of Python came with the regex module, which provides Emacs-style patterns. Emacs-style patterns are slightly less readable and don’t provide as many features, so there’s not much reason to use the regex module when writing new code, though you might encounter old code that uses it.

Regular expressions (or REs) are essentially a tiny, highly specialized programming language embedded inside Python and made available through the re module. Using this little language, you specify the rules for the set of possible strings that you want to match; this set might contain English sentences, or e-mail addresses, or TeX commands, or anything you like. You can then ask questions such as “Does this string match the pattern?”, or “Is there a match for the pattern anywhere in this string?”.
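As a rough sketch of the two questions above, the snippet below uses re.match() for “Does this string match the pattern?” (it only tries at the start of the string) and re.search() for “Is there a match anywhere?”. The pattern WORD, the helper names matches_at_start and matches_anywhere, and the sample strings are hypothetical, chosen only for illustration; they do not come from the original document.

```python
import re

# Hypothetical pattern for illustration: one or more lowercase ASCII letters.
WORD = r"[a-z]+"


def matches_at_start(pattern, text):
    """Return True if the pattern matches at the beginning of text."""
    return re.match(pattern, text) is not None


def matches_anywhere(pattern, text):
    """Return True if the pattern matches at any position in text."""
    return re.search(pattern, text) is not None


if __name__ == "__main__":
    # Sample strings are assumptions made up for this sketch.
    for sample in ("tempo 120", "120 tempo", "12345"):
        print(sample, matches_at_start(WORD, sample), matches_anywhere(WORD, sample))
```

Wrapping the calls in small boolean helpers is only a convenience for this sketch; both re.match() and re.search() return a match object on success and None on failure, so their results can also be tested directly in an if statement.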
You can also use REs to modify a string or to split it apart in various ways.

Regular expression patterns are compiled into a series of bytecodes which are then executed by a matching engine written in C. For advanced use, it may be necessary to pay careful attention to how the engine will execute a given RE, and write the RE in a certain way in order to produce bytecode that runs faster. Optimization isn’t covered in this document, because it requires that you have a good understanding of the matching engine’s internals.

The regular expression language is relatively small and restricted, so not all possible string processing tasks can be done using regular expressions. There are also tasks that can be done with regular expressions, but the expressions turn out to be very complicated. In these cases, you may be better off writing Python code to do the processing; while Python code will be slower than an elaborate regular expression, it will also probably be more understandable.

2 Simple Patterns

We’ll …

