UCF COT 3100 - Strings and Languages - D2931192

Home> Schools> University of Central Florida> Computer Theory (COT) > COT 3100> Strings and Languages

UCF COT 3100 - Strings and Languages

School name University of Central Florida

Course Cot 3100- Introduction to Discrete Structures

Pages 7

Download Save

Unformatted text preview:

Strings and Languages A string is simply the concatenation of several letters in an alphabet Typically we will define an alphabet as a set So for example we could have a b Then a string over this alphabet would be any word formed with only the letters or characters a and b There is no limit on the length of a string It must simply be a non negative integer This means there is a string of length of length 0 This is known as the empty string The empty string is typically denoted by In particular if you concatenate the empty string with any other string you get back that string It seems silly to have an empty string but it will help out in certain situations Make sure you do not confuse the empty string with the empty set Also recognize that can never be a letter of an alphabet In the book they define non empty strings in the following manner n is a string of length n over the alphabet and is defined as below 1 1 2 n 1 xy x y n And any string is simply a subset of n where n is a positive integer Using the book s definition we can define the following two sets n 1 to n or in English is the set of all strings of positive length over an alphabet n 0 to n or in English is the set of all strings of over an alphabet Now that we have these definitions we can define a language A language L over an alphabet is any subset of If a string w L then we say that the string w belongs in the language L Otherwise we say w does not belong in L For example horse English but gjtysihg English to the best of my knowledge In this class we will focus on a specific type of language regular languages In particular a language is a regular one if and only if it can be expressed as a regular expression You can think of a regular expression as a mold For example when you are trying to find a file on your computer but don t know the EXACT name of the file you may enter Eric doc because you know that you are searching for a file whose name starts with Eric and is a Word document In this situation the string with a star in it is a mold applying to an infinite number of strings namely those that start with Eric and end in doc It may seem easy to see if a string fits a mold but there s more than meets the eye here Consider the mold a a When trying to mold the string banana you may match the first star to the b and the second star to the first n and conclude that the string doesn t fit the mold BUT this is not the case because we can match the second star to nan instead of just an n Thus if there is ANY way for a string to match a mold then that string is said to be described or specified by that mold Regular expressions are nothing but molds To determine if a particular string is an element of the language described by a regular expression you must see if there exists a way to make that string fit to the mold given by the regular expression Here are the rules for forming a regular expression R is a regular expression if R is 1 a for some a 2 3 4 R1 R2 where R1 and R2 are regular expressions 5 R1 R2 where R1 and R2 are regular expressions 6 R where R is a regular expression Each of these needs some explaining The first three are the building blocks or the atoms of regular expressions Only the correct character fits 1 while only an empty string a string of length 0 fits 2 and absolutely nothing fits 3 These three alone are quite boring and useless However we can create more complex expressions using rules 4 5 and 6 To decide if a string fits to the mold of R1 R2 you need to check if it fits to the mold of R1 OR if it fits to the mold of R2 If the answer to either question is yes then the string is a part of the language described by R1 R2 To decide if a string fits the mold of R 1 R2 is a bit more complicated Here is the rule Let w be a string If there exists a way to split up w such that w xy the concatenation of strings x and y where x is in the language described by R1 and y is in the language described by R2 then w is in the language described by R 1 R2 Once again just because one way to split up the string doesn t work so to speak doesn t mean that the string is NOT in the language Take this example Consider the language described by the regular expression a baa bba bb The string baabb is in the language in spite of the fact that the string b does NOT match a baa and the string aabb does not match bba bb Finally the most difficult rule of all is the Kleene star rule A string w is in the language described by R if w x 1 x2 xn for some non negative integer n where xi is a string in the language described by R for 1 i n for some set of xis The difficulty here is that for a string to be in the language there must simply be some way to break up the string into smaller pieces that are all described by the mold R There are WAY TOO MANY ways to split up a string for us to really check them all Thus we must be careful before we claim that a particular string is NOT in the language described by some regular expression especially ones with s in them So for example here are some regular languages 1 Any finite language You can form each string in the language by concatenating elements from together and you can put these all together with to form the corresponding regular expression 2 ab n n 0 the corresponding regular expression is ab This language contains the strings ab abab ababab etc 3 The regular expression for the set of all languages that end in bb is a b bb One thing to keep in mind is that we will be using letters to stand for BOTH elements of the alphabet AND languages themselves I will try to stick to the convention that a lowercase letter signifies an element of an alphabet where as an UPPERCASE letter signifies an entire language You can essentially treat regular expressions identical to regular languages The basic difference is that a language should be written out in a set format but a regular expression is not The terminology often used is that …

View Full Document


School:
Email:
New Password:
Confirm Password:

UCF COT 3100 - Strings and Languages

Sign up for free to view:

Please select your school