DOC PREVIEW
UMD CMSC 132 - Regular Expressions & Automata

This preview shows page 1-2-15-16-31-32 out of 32 pages.

Save
View full document
Premium Document
Do you want full access? Go Premium and unlock all 32 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Regular Expressions Automata Fawzi Emad Chau Wen Tseng Department of Computer Science University of Maryland College Park Complexity to Computability Just looked at algorithmic complexity How many steps required At high end of complexity is computability Decidable Undecidable Complexity to Computability Approach complexity from different direction Look at simple models of computation Regular expressions Finite automata Turing Machines Overview Regular expressions Notation Patterns Java support Automata Languages Finite State Machines Turing Machines Computability Regular Expression RE Notation for describing simple string patterns Very useful for text processing Finding extracting pattern in text Manipulating strings Automatically generating web pages Regular Expression Regular expression is composed of Symbols Operators Concatenation AB Union A B Closure A Definitions Alphabet Set of symbols Examples a b A B C a z A Z 0 9 Strings Sequences of 0 or more symbols from alphabet Examples a bb cat caterpillar Languages empty string Sets of strings Examples a bb cat More Formally Regular expression describes a language over an alphabet L E is language for regular expression E Set of strings generated from regular expression String in language if it matches pattern specified by regular expression Regular Expression Construction Every symbol is a regular expression Example a REs can be constructed from other REs using Concatenation Union Closure Regular Expression Construction Concatenation A followed by B L AB ab a L A AND b L B Example a a ab ab Regular Expression Construction Union A or B L A B a a L A OR a L B Example a b a b Regular Expression Construction Closure Zero or more A L A a a OR a L A OR a L A L A Example a a aa aaa aaaa ab c c abc ababc abababc Regular Expressions in Java Java supports regular expressions In java util regex Applies to String class in Java 1 4 Introduces additional specification methods Simplifies specification Does not increase power of regular expressions Can simulate with concatenation union closure Regular Expressions in Java Concatenation ab ab c ab abc Union bar or square brackets a b abc a b a b c Closure star ab ab abab ababab ab a b aa ab ba bb Regular Expressions in Java One or more plus a One or more a s Range dash a z 0 9 Any lowercase letters Any digit Complement caret at beginning of RE a a z Any symbol except a Any symbol except lowercase letters Regular Expressions in Java Precedence Higher precedence operators take effect first Precedence order Parentheses Closure Concatenation Union Range a b ab a b Regular Expressions in Java Examples ab ab ab cd a b c d abc d ab abb abbb abbbb ab abab ababab ab cd abd acd ad bd cd When in doubt use parentheses Regular Expressions in Java Predefined character classes d D s S w W Any character except end of line Digit 0 9 Non digit 0 9 Whitespace character t n x0B f r Non whitespace character s Word character a zA Z 0 9 Non word character w Regular Expressions in Java Literals using backslash Need two backslash Java compiler will interpret 1st backslash for String Examples 4 backslashes interpreted as by Java compiler Using Regular Expressions in Java 1 Compile pattern import java util regex Pattern p Pattern compile a z 2 Create matcher for specific piece of text Matcher m p matcher Now is the time 3 Search text Boolean found m find Returns true if pattern is found in text Using Regular Expressions in Java If pattern is found in text m group string found m start index of the first character matched m end index after last character matched m group is same as substring m start m end Calling m find again Starts search after end of current pattern match If no more matches return to beginning of string Complete Java Example Code import java util regex public class RegexTest public static void main String args Pattern p Pattern compile a z Matcher m p matcher Now is the time while m find System out print m group Output ow is the time Language Recognition Accept string if and only if in language Abstract representation of computation Performing language recognition can be Simple Strings with even number of 1 s Hard Strings representing legal Java programs Impossible Strings representing nonterminating Java programs Automata Simple abstract computers Can be used to recognize languages Finite state machine States transitions Turing machine States transitions tape Finite State Machine States Starting Accepting Finite number allowed Transitions State to state Labeled by symbol a 0 1 1 q2 q1 Start State 0 Accept State L M w w ends in a 1 Finite State Machine Operations Move along transitions based on symbol Accept string if ends up in accept state Reject string if ends up in non accepting state 0 011 Accept 10 Reject 1 1 q2 q1 0 Finite State Machine Properties Powerful enough to recognize regular expressions In fact finite state machine regular expression Languages recognized by finite state machines Languages recognized by regular expressions 1 to 1 mapping Turing Machine Defined by Alan Turing in 1936 Finite state machine tape Tape Infinite storage Read write one symbol at tape head Move tape head one space left right 0 Tape Head 1 1 q2 q1 0 Turing Machine Allowable actions Read symbol from current square Write symbol to current square Move tape head left Move tape head right Go to next state Turing Machine Tape Head 1 0 0 1 0 Current State Current Content Value to Write Direction to Move New state to enter START Left MOVING MOVING 1 0 Left MOVING MOVING 0 1 Left MOVING MOVING No move HALT Turing Machine Operations Read symbol on current square Select action based on symbol current state Accept string if in accept state Reject string if halts in non accepting state Reject string if computation does not terminate Halting problem It is undecidable in general whether long running computations will eventually accept Computability Computability A language is computable if it can be recognized by some algorithm with finite number of steps Church Turing thesis Turing machine can recognize any language computable on any machine Intuition Turing machine captures essence of computing Program finite state machine Memory tape


View Full Document

UMD CMSC 132 - Regular Expressions & Automata

Documents in this Course
Notes

Notes

8 pages

Recursion

Recursion

12 pages

Sorting

Sorting

31 pages

HTML

HTML

7 pages

Trees

Trees

19 pages

HTML

HTML

18 pages

Trees

Trees

19 pages

Honors

Honors

19 pages

Lecture 1

Lecture 1

11 pages

Quiz #3

Quiz #3

2 pages

Hashing

Hashing

21 pages

Load more
Loading Unlocking...
Login

Join to view Regular Expressions & Automata and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Regular Expressions & Automata and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?