CORNELL CS 611 - Type Inference and Unification - D2889968

Home> Schools> Cornell University> Computer Science (CS) > CS 611> Type Inference and Unification

CORNELL CS 611 - Type Inference and Unification

Pages 4

Download Save

Unformatted text preview:

CS611 Lecture 27 Type Inference and Unification 6 November 2006Lecturer: Dexter Kozen1 Type InferenceType inference refers to the process of determining the appropriate types for expressions based on how theyare used. For example, ML knows that in the expression (f 3), f must be a function (because it is applied tosomething, not because its name is f!) and that it takes an int as input. It knows nothing about the outputtype. Therefore the typ e inference mechanism of ML would assign f the type int ->’a.- fn f => (f 3);val it = fn : (int -> ’a) - > ’aThere may be many different occurrences of a symbol in an expression, all leading to different typingconstraints, and these constraints must have a common solution, otherwise the expression cannot be typed.- fn f => f(f 3);val it = fn : (int -> int) -> int- fn f => f(f ”hi”);val it = fn : (string -> string) -> string- fn f => f(f 3,f 4);stdIn:19.9-19.19 Error: operator and operand don’t agree [literal]operator domain: intoperand: ’Z * ’Zin expression:f (f 3,f 4)In the first example, how does it know that the output type of f is int? Because the input type of f is int,and the output of f is fed into f again, so the output type of f has to be the same as the input type of f.If a program is well-typed, then a type can be inferred. For example, consider the programlet square = λz. z ∗ z inλf. λx. λy.if (f x y)then (f (square x) y)else (f x (f x y))We are applying the multiplication operator to z, therefore we must have z : int, thus λz. z ∗ z : int → intand square : int → int. We know that the type of f must be something of the form f : σ → τ → bool forsome σ and τ, since it is applied to two arguments and its return value is used in a conditional test. Sincef is applied to the value of square x as its first argument, it must be that σ = int. Since f is applied to thevalue of f x y as its second argument, it must be that τ = bool. The return value is also bool. Thus thetype of the entire program is (int → bool → bool) → int → b ool → bool.2 UnificationBoth type inference and pattern matching in ML are instances of a very general mechanism called unification.Briefly, unification is the process of finding a substitution that makes two given terms equal. Patternmatching in ML is done by applying unification to ML expressions, whereas type inference is done by applyingunification to type expressions. It is interesting that both these procedures turn out to be applications of thesame general mechanism. There are many other applications of unification in computer science; for example,the programming language PROLOG is based on it.The essential task of unification is to find a substitution S that unifies two given terms (that is, makesthem equal). Let’s write s S for the result of applying the substitution S to the term s. For example,f(x, h(x, y)){g(y)/x, z/y} = f(g(y), h(g(y), z)),1where the substitution operator {g(y)/x, z/y } applied to a term simultaneously substitutes g(y) for x and zfor y. The substitution is simultaneous, not sequential. Sequential substitution would give a different result:f(x, h(x, y)){g(y)/x}{z /y} = f(g(y), h(g(y), y)){z/y} = f(g(z), h(g(z), z)).Thus, given s and t, we want to find S such that sS = tS. Such a substitution S is called a unifier for sand t. For example, given the termsf(x, g(y)) f(g(z), w) (1)the substitutionS = {g(z)/x, g(y)/w} (2)would be a unifier, sincef(x, g(y)){g(z)/x, g(y)/w} = f(g(z), w){g(z)/x, g(y)/w} = f (g(z), g(y)).Note that this is a purely syntactic definition; the meaning of expressions is not taken into considerationwhen computing unifiers.Unifiers do not necessarily exist. For example, the terms x and f(x) cannot be unified, since no substi-tution for x can make the two terms equal.Even when unifiers exist, they are not unique. For example, the substitutionT = {g(f(a, b))/x, f(b, a)/y, f(a, b)/z, g(f (b, a))/w}is also a unifier for the two terms (1):f(x, g(y)) T = f (g(z), w) T = f(g(f(a, b)), g(f(b, a))).However, when a unifier exists, there is always a weakest or most general unifier (mgu) that is unique up torenaming. A unifier S for s and t is a most general unifier (mgu) for s and t if• S is a unifier for s and t,• any other unifier T for s and t is a refinement of S; that is, T can be obtained from S by doing furthersubstitutions.For example, the substitution S in the example above is an mgu for f(x, g(y)) and f(g(z), w). The unifierT is a refinement of S, since T = S U, whereU = {f(a, b)/z, f (b, a)/y}.Note thatf(x, g(y)) S U = f(x, g(y)){g(z)/x, g(y)/w}{f(a, b)/z, f(b, a)/y}= f(g(z), g(y)){f(a, b)/z, f(b, a)/y}= f(g(f(a, b)), g(f (b, a)))= f(x, g(y)) T.Note that we can compose substitutions, as we did in S U. This is the substitution that first applies S,then applies U to the result. The composition is also a substitution.23 Unification AlgorithmThe unification algorithm is known as Robinson’s algorithm (1965). We need unification for not just for apair of terms, but more generally, for a set of pairs of terms. We say that a substitution S is a unifier for aset {(s1, t1), . . . , (sn, tn)} if siS = tiS for all 1 ≤ i ≤ n.The unification algorithm is given in terms of a function Unify() that takes a set of pairs of terms (s, t)and produces their mgu, if it exists. If E is a set of pairs of terms, then E {t/x} denotes the result of applyingthe substitution {t/x} to all the terms in E.• Unify({(x, t)} ∪ E)4= {t/x} Unify(E {t/x}) if x /∈ FV(t)• Unify(∅)4= I (the identity substitution x 7→ x)• Unify({(x, x)} ∪ E)4= Unify(E)• Unify({(f(s1, . . . , sn), f(t1, . . . , tn))} ∪ E)4= Unify({(s1, t1), . . . , (sn, tn)} ∪ E).In the first rule, {t/x} denotes the substitution that substitutes t for x, and {t/x} Unify(E {t/x}) denotesthe composition of {t/x} and Unify(E {t/x}). Since we write substitutions on the right, we follow theconvention that composition is from left to right; thus S T means, “do S, then do T ”.One circumstance that causes a set of terms not to unify is if it contains a pair (x, t) where x 6= t but xoccurs in t; then no substitution can make x and t equal.4 Type Inference and UnificationNow we show how to do type inference using unification on type expressions. This technique gives the mostgeneral type (mgt) of any typable term; any other type of this term is a substitution instance of its mostgeneral type. Rec all the Curry-style simply typed λ-calculus with syntaxe ::= x | e1e2|

View Full Document


School:
Email:
New Password:
Confirm Password:

CORNELL CS 611 - Type Inference and Unification

Sign up for free to view:

Please select your school