Hi everyone. In the previous lecture, we learned TVFAE and how to interpret it. We defined a new language called TVFAE, so that we can define our own language we have a new language, our own language that allows user-defined types and how to interpret it. Today, we are going to learn how to type check that language. The relevant part in the textbook is Chapter 21, algebraic data types. Before looking into the typing rules, let's look at this example, we have two type definitions and an expression. First, we defined a new type called fruit, which has two variants, apple and banana. Apple and banana, they both have only one field of the same type number. The second type is called color and the color type has two variants, again with apple and banana our constructors are just the same with the other type called fruit Is it okay. This time, apple takes a field of type num arrow num, banana takes a field of type Boolean. They help to variants, with the same names but different field types. What's going to happen? Let's look at the expression. It's a function called expression the function part is this and the argument part is that the argument part is banana, true it's like that. The function part, expect a fruit of fruit. It has variant of name banana but it expects a number, but this time it has a true. Is it okay? This function just pattern matching on the argument value because we said that x type is fruit. We are expecting apple and banana takes numbers. What's going to happen?Do you want to allow this or not? I don't want to allow that. How can we reject this code using a type system? Let's start with well-formed type expression. Previously, the types were given numbers, booleans, strings, and arrow types from types to types, pair types from type using types and types. We did not define new types by ourselves. That is okay. Now that programmers can define new types, now that we allows user-defined types, there may be a problem because people make mistakes. When we defined a new type called fruit, with apple and banana, then it's okay to use a function expression whose parameter type is fruit. That's okay. But if we did not define any type called fruit, if we just used the name fruit, it's an error. Free type identifier error or something like that. We'd like to check that types are good to use. That's called well-formed type expressions. Online type checking of an expression. Because show this on the type environment Gamma on expression is type is Tau. This time under type environment Gamma, some time is good, that's what we mean by well-formed type expression. Or given primitive, type, number, boolean, string they are good. On arrow type Tau 1 to Tau 2 is well-formed. Then when Tau 1 is well-formed and Tau 2 is well-formed. We can write a similar rule for pair. Tau 1 pair Tau 2 is well-formed when Tau 1 is well-formed and Tau 2 is well-formed. So far so good. That's good. How about a type name T? Type name T is well- formed when the type names definition is in the type environment Gamma. Reasonable. A name x should be in the domain of Sigma. Remember, that's how we check the free identifier error. Now, here are two typing rules for type definition and use. Type definition and pattern matching. Don't panic. I'll explain that to you. How do we type check this type definition. Use any. Under a given type environment Gamma, we'd like to type check new type T with two variants, X_1 and X_2 whose fields have types Tau 1 and Tau 2 respectively. What's going to be the type of this whole expression? That's the question. Then first, this programmer wrote two types, T1 and T2 we do not trust programmers. We do like to check whether programmers made a mistake or not so we need to check where firmness of T1 and T2. Now that our language allows user-defined types, whenever a programmer writes a type in a code, when you need to check the real firmness of that type or notation so we're going to check the real firmness of T1 and T2 on which type environment actually, we need to extend this type environment gamma with what? With three pieces of information. Three, so many? Because we need to add this new type definition, so that when we used T in this T1 and T 2, we can check that one more time. Remember, the main motivation, the beginning of this defining new language called QVFA was to have recursive data type. What do we mean by recursive datatype, when we define type t, we may use that type t. That's what we mean by recursive data type. When we define type t, we may use type t, where in T1 and T 2, does it make sense? Because X1 and X2 are variant names, empty, cons, rows, and really there are no place two type can occur. Then this type T may appear in the type of the field, just like numlist is either empty or cons, conses field has type a pair of a number, and numlist here, the t may appear here. That's why when we check the real firmness of T1 and T2, we need to use this extended type environment. When we type check E, we need to use this extended type environment gamma prime. Gamma prime is going to be extended environment. How do you make it on top of this gamma, we add this type definition as it is, and two more pieces of information. What are they? They are the types of the names that may appear in E. One more time. The types of the names that may appear in E. In e, we may use X1 and X2 as Constructors what are constructors, constructors are waiting for values to make variants. One more time, constructors wait for values to make variants. What are variants? They are constructors, and argument values of type t. Are you with me? X1 and X2, they're actually function types. They have function types because they are expecting argument value of type T1 and T2 respectively. When those constructors take values of the type, they are going to make variants of type T. Are you with me? Let me repeat this one more time, and move on to pattern match. The type of this type definition and use of that in E. In E, we may use type t, In e, we may use X1 and X2 as constructors to make variants. When X1 and X2 are used here, they could be just constructor values or more effectively, they may get values of type T1 and T2 respectively to make variants of type t meaning that X1 and X2 are constructors. Meaning they have every time to make a value of type t, when they get corresponding argument values of type T1 and T2 so we add this information and the definition of T2, this giving gamma, and let's call that gamma prime using that extended gamma prime, we check the real firmness of those two types written by programmers, T1 and T2, and after all that, we type check this expression under gamma prime, get the type. That's going to be the type of the whole expression. So far so Good, hope that makes sense to you otherwise, read the textbook. Now, let's move on to pattern matching, so we have a match of the value of this expression. This language syntax tells us that the value of e_0 should be variant value whose constructors should be either x_1 or x_2. Meaning that we actually assumes the type of e_0. Meaning that when we type check this expression e_0, it should have some type t, which is defined in the given Gamma. That should be already defined in the given Gamma, otherwise, it's an error. I don't know the type t. In that time environment, the t's definition should tell us the names of those constructors, and their argument types. Now that we have all the information, all we need to do is to type check this expression e_1 and e_2 and make sure they have the same type. What are their types? e_1's type, when we type check e_1, we may use x_3. We know x_3 is a type. How? That's key definition says. The definition of t says that the variant x_1 is expecting a value of type Tau 1. The variant x_2 is expecting a value of type Tau 2. You're going to extend the Gamma using this information, x_3's time is going to be Tau 1. Check the type of e_1, x_4's type is going to be Tau 2 check the type of e_2, those two branches type should be just the same. That's going to be the type of this whole expression. So far good? Yeah, it looks difficult but not really. Again, please read a textbook. That textbook is really explaining this content clearly. So far so good. I have a bad news for you. You learned a lot until now, but we have a little tiny problem with this typing rule. But don't worry, I'm going to come back to that in this lecture. Let's implement this then. Type environment is now our mapping from not only names to types, but also names to its definition. One more time. Type environment, Gamma used to have types of names. Now that we extended its Gamma to include t's definition, we are abusing the notation, I'm sorry. We added another information saying, you know what? NumList has two variants with empty and cons, that information. When we look up the type of a name, we're going to use addVar function. When you look up the definition of a time name, we're going to use addTBind. I know it's not a clean implementation, but short implementation. I strongly recommend you to revise this implementation to be cleaner and more beautiful. You can do that. Then, how do we implement this where from to type expression checking? Let's define a new function called validType. Under a given time environment, NumT is always good. ArrowT type is good if primary type is valid and return type is also valid. Pair type is going to be very typed, very formed, if the first component type is valid and the second component type is valid. How about type name? If the type name is defined in a type environment, that's good. That's simple. The implementation was simple. As I said before, now that TVFAE allows programmers to write down types here, in the previous language like TFAE or TPFAE, types were not user defined. We did not need to check the type annotation. But now that programmers can define new types, we need to check. Function expressions type checking now as this what validity check of parameter type. This new language feature usually define types even require the change in the previous existing type system. Now let's implement this, the type checking of type definition. Are you ready? Here's the implementation. Not that difficult. Let's see. Type definition of type name with first variant and second variant. How do you do that? Let's extend this Gamma with some information. First, tyEnv at the definition of type name. Let's call that tyEnvT. We extend its Gamma with this information, that's tyEnvT. Then extend this type environment with two more pieces of information, same. The first variance type is an arrow t The second variant constructor's type is arrow t. From their field's types to these two type name. Good. We're going to check the reaffirmness of Tau 1 and Tau 2 under Gamma prime, under tyEnvT. We check the reaffirmness of Tau 1 and Tau 2 under Tau Gamma prime, which is tyEnvV but check that under tyEnvT. I'm going to come back to here in a minute. Then type check this expression on the Gamma prime, which is tyEnvV. That's good. Did you see the discrepancy? One more time. In the typing rule, we have Gamma and Gamma prime two-type environments. In the implementation, we have three type environments. The given type environment, tyEnv, and tyEnvT and tyEnvV. Why three? Technically speaking, this implementation does not capture the typing rule precisely. They are different slightly, but their effect is just the same. To emphasize that this information is only needed for the checking of Tau 1 and Tau 2 and this two more information is only needed for type-checking e_0. That's why I showed you two-type environments. When we check the validity of types, we only need tyEnvT. When we check the type of V we need everything. Are you with me? Can you believe me? Try to check whether that's correct or not what I just said. That's going to help you understand this material better. How are we going to implement this parametric? Yes, again, the code is long, but the extra content is not that complex. We first type check the conditional expression, this matching being matched to expression c. Then it should be some type name, otherwise, it's an error. It's a type name, then get the definition from this time environment. We got it. Then we need to check e_1 and e_2's type under which environment. On top of the given environment, we add this information saying these names. The name whose value is going to be the field of this variant. Then its type is going to be Tau 1. Here the second branch here is something like that. Using that information, get the type and they should be the same. Read the code, read the typing rule, and try to check the analogy. Reading the textbook is going to be helpful. Very good. You've gone through the operational semantics and static semantics of TVFAE. Now I told you that there was one small issue there. Before that let's see what we'd like to achieve by using type system. Usually when a language has an operational semantics and type system well-defined that we'd like to get on this new cool theorem called type soundness. Type soundness is a theorem of the form saying if under empty environment if we don't have any prior information and type check a given program e and this type is Tau. Meaning that, we type check a given program, and it also were typed, then actually running the code never produces an error. Isn't it cool? One more time. When we have a program, we check a type system, we check the program's type before excreting this program and the type checker says, okay to go, then, while excreting the program, there's no error. It's awesome. That's what we've been looking for. So far so good. But if we add division, then divide by zero errors or exceptions may be okay because we can attach there by using this type system. We are going to select the device that theorem saying, if type checker said okay then running e never produces an error except division by 0 or except some other exceptions. In general, soundness rules out a certain class of runtime errors related to types. One more time. When our type system is a type sound, then when our type checker says, okay, when we execute our program at runtime, there is going to be no error related to types. Yes. If our language, our type system, fails with this type soundness, meaning if the type checker said, okay, there is an error at runtime. It means that there is an error in your type syst. Really? Yes. Let's look at this example. That's why we made the typing rule for type definition intentionally wrong to show you this, meaning that our current version of the type system of TVFAE is not sound. Let's look at this example. It's a function call. The first function part has what? Type foo with two variants a and b, whose parameter types are number and the body is function expression. Its type is going to be foo arrow num and the argument part. It's another type definition to a, foo, but its variance a and b have different field types, especially a takes a Boolean. When we make a true, its type is going be foo, say a compile time. The function part has type foo arrow number argument type is foo, foo, foo. There should be no problem at runtime. But at runtime, if we run this code, this a true is going to be passed to foo, which expects a number. Runtime error. Type checker said, okay but it was not okay. That's what we mean by a bug in a type system. There's one solution, meaning that this problem was due to local type declaration, meaning that we are defining types here and there and everywhere. Let's avoid local type declarations. Prohibit that. Really, I don't like that. I want freedom. How about this? Do not let the name t escape its type definition. Again, scope. Type keys, scoop is here, not outside. That's going to be our solution. Meaning that when we define a type t, just use it here. Meaning that when we define a type t in the given type environment t should not already exist. There should be only one t. When we type check this expression, the type should not contain t, meaning that we checked away from this of Tau 1 and Tau 2. We check the type of e, all under this extended type environment Gamma prime. But the resulting type Tau should be we're formed under the given type environment, Gamma, not Gamma prime. This is going to be our solution. Very good. One more time. Type soundness is that, when a type checker said, okay, at runtime there should be no error related to pace. As I said, there could be errors and exceptions. Errors meaning that we have expected values, but the result are not the values which is going to be an error. We are not expecting subtraction from a number and a Boolean. We are not expecting calling a number in a function call site they are errors. But exceptions, meaning there could be division by 0, but it's technically okay, but its value is not really everybody exceptional cases. That's programmer's decision. When something happens that exceptionally, then we may throw some exception. There are different. A type checking we can detect those errors. But at type checking, we do not want to detect exceptions because they should occur at runtime. There should be some clear distinction by programmers between errors and exceptions. There are various errors that can be caught at parsing time, at compile time, at run-time. These are errors and again exceptions are different thing. At type checking, we'd like to reject bad programs as much as possible so that at runtime, when we interpret a program, there should be no type-related errors. That's what we mean by type checking and interpretation. Good job. Here's quiz. Which of the following is a role for well-formed arrow types? An arrow types is well-formed when its parameter type enrichment type are well-formed. Thank you.