Slide 1Slide 2Slide 3Slide 4Slide 5Slide 6Slide 7Slide 8Slide 9Slide 10Slide 11Slide 12Slide 13Slide 14Slide 15Slide 16Slide 17Slide 18Slide 19Slide 20Slide 21Slide 22Slide 231Strings• A string over a set A is a finite sequence of elements from A.Definition. An alphabet is a nonempty, finite set of indivisible symbols. We are going to denote it by . • Any program is a string of keywords, variable names, and permissible symbols. • The set of elements from which the strings are built is called an alphabet. • A programming language should satisfy general rules (grammar) to be understood by computer (compiler). These rules are studied by the formal theory of programming languages.2Definition. A string w over an alphabet is a sequence of symbols, w = a1 a2… an, where each ai , 1 i n. • The number of symbols is called the length of the string, |w|=n.There is one special string that has zero length (contains no symbols).It is called the empty string and has special notation , .|w|=0 w = . is not an element of any alphabet, .Example. Let = {a, b, c}. Find all possible strings with length less or equal 3 built from . Length 0: Length 1: a b cLength 2: aa ab ac ba bb bc ca cb ccLength 3: aaa aab aac aba abb abc aca acb acc …3• Two strings u and v can be concatenated to form a singlestring uv, that consists of the symbols of string u, followed by symbols of of string v. The length | uv | = | u|+| v |.• u = u = u • Concatenation is associative: (uv)w = u(vw),but not commutative: uv vu (the order is important!).4Definition. Any set of strings over some alphabet is called a language. Alphabet itself is a language as well (the language of all one-symbol words). Since languages are sets, we can apply all set operations to languages: union, intersection and set difference. There is one operation specific for languages: concatenation of two languages : L1L2 ={uv | u L1 and v L2 } Q : L1L2 = L2L1 ? A: L1L2 L2L1 Examples: Set of all executable computer programs is a language.5Example. Take the alphabet = {a, b, c}. Consider two languagesover alphabet : L1 ={a, ab} and L2 ={b, bc, c}. Find L1L2 and L2L1.We need to take every string from L1 and concatenate with everystring from L2 . In this way we get |L1| |L2 | strings: ab, abc, ac, abb, abbc, abc.Note, that not all strings are distinct, like abc. L1L2 = {ab, abc, ac, abb, abbc} . The cardinality | L1L2 | is the number of distinct strings,resulting from concatenation . In general, | L1L2 | | L1| |L2 |and | L1L2 | | L2L1 | In the example | L1L2 | = 5< | L1| |L2 |=6. In the same way: L2L1 = {ba, bab, bca, bcab, ca, cab}.6In particular, we can consider the concatenation of an alphabet with itself: is the language of all two-symbol words. Notation: = 2 What is 2? Similarly, 3 = 2, the language that consists of all 3-symbolwords: 3 ={aaa, aba, baa, bba, aab, abb, bab, bbb}. To make this recursive definition agree with the basis case n =1, = 0 , zero power 0 is defined as 0 = {}, (no matter what is ). Then {} ={ x | x } = { x | x }= What is 2 3 … n ?Example: ={a, b}, = 2 = {aa, ab, ba, bb} So, we can define recursively for any n>1: n = n-17‘Kleene star’ notation: * = 0 1 2 … So, * is the (infinite) set of all possible words over alphabet ,including empty string . Example. = {0, 1}. * is an infinite set of all possible bit strings.(or all binary numbers including numbers with leading 0’s and empty string).Any language L over alphabet is a subset of *, L *. Note that {} , because ={} {}| { }|=1, ||=|{}|=0.A language L may contain , or may not.8Example. Consider two languages over alphabet = {a}: L1={aa}, L2={, aa, aaaa}. What is L1*? By definition of Kleene star L1* = L10 L11 L12 … ={}{aa} {aaaa} {aaaaaa} …What is L2*? L2* = L20 L21 L22 … ={}{, aa, aaaa} {, aa, aaaa, aaaaaa, aaaaaaaa}…={, aa, aaaa, aaaaaa, …} = L1* = {, aa, aaaa, aaaaaa, …}infinite set of strings of even length build from symbol a.9Definition. A string u is called a substring of v if there exist twostrings x and y, such that v = xuy, and x, y * Definition. A string u is called a prefix of v if there exists a string x *, such that v = ux. Similarly, a string u is called a suffix of v if there exists a string y *, such that v = yu.10Theorem 1. Let A, B and C be sets of strings. Then (AB)C = ACBC Proof. a) We need to prove the equality of two sets of strings. We can do it by double-inclusion, i. e. to show that i) (AB)C ACBC andii) ACBC (AB)C11i) To prove (AB) C ACBC, it’s suffices to show that for any string w, w(AB)C w A C BC Take any w (AB)C … (xA or xB) and y C (dfn of )(dfn of concat)x, y, such that w = xy and x(AB) and yC … (xA and yC) or (xB and yC) (distributive property) w AC or wBC (dfn of concat) w AC BC (dfn of )12ii) To prove that ACBC (AB)C, we need to show that for any string w, w ACBC w (AB)CTake any w AC BC w AC or w BC (dfn of )x, y, such that w = xy and (x A and y C) or (x B and y C) (dfn of concat) w (AB)C (dfn of concat)So, we proved ACBC (AB)C and (AB)CACBC,that means (AB)C = A CBC So we can have two cases. In the first case, (x A and y C) implies that (x AB and y C) because A (AB).In the second case, (x B and y C) implies that (x AB and y C) because B (AB). So, in either case we have13Proof. To prove subset relation we need to show that for any string w, w(AB)C wAC BC. Why not to prove ACBC (AB)C as well? Let’s try. Take arbitrary
View Full Document