Stanford CS 224n - Natural Language Processing - Lecture 01


Instructor (Christopher Manning): Hi, everyone. Welcome to the first class of Stanford's CS 224n, which is an intensive introduction to natural language processing, concentrating primarily, but not exclusively, on using probabilistic methods for doing natural language processing. So let me just say a teeny bit about the structure of the course and some of the administration. If you want more information on any of these things, the main thing to do is go to the website, which is cs224n.stanford.edu. So I'm the instructor, Christopher Manning, and there are two TAs this quarter, Paul Boundstark and Dejuan Chang. The lectures right now are Monday/Wednesday 11:00 to 12:15, being broadcast live. Most weeks there'll be a section on Fridays from 11:00 to 12:15.

So for the handouts for today, there's the course syllabus, there's the first lecture, and then, most importantly, I'm handing out the first assignment already today, and I'll say a bit more about that as the class proceeds. But this would be a good time, over the weekend, to look at the first assignment and check that you know how to do all the kinds of things you need to be able to do to get productively working on the assignment next week in particular. So for this class there are three programming assignments that we specify for you, and then there's a final project, and most of the grade is that work. In addition to that, there are going to be just a few percent on weekly quizzes, just to check that people are vaguely keeping up with all the other topics. But we're seeing this as primarily a project-based learning kind of class, where a lot of the learning goes on in doing these projects. And so for the first three projects, all three of those projects have a lot of support code to make it easy for you to do interesting things in the two weeks you get to do them in, and all of that support code is written in Java. In particular, it's Java 1.5, for generics and stuff like that.
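The course's actual support code isn't shown in the transcript; as a minimal sketch of the kind of Java 1.5 generics-plus-collections usage being described, here is a hypothetical word-counting helper (the class and method names are illustrative, not from the assignments):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GenericsDemo {
    // Count word frequencies using typed collections -- the kind of
    // Java 1.5 generics idiom the course support code relies on.
    public static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String w : words) {            // typed for-each, also new in Java 1.5
            Integer c = counts.get(w);
            counts.put(w, (c == null) ? 1 : c + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<String>();
        words.add("the");
        words.add("cat");
        words.add("the");
        Map<String, Integer> counts = countWords(words);
        System.out.println(counts.get("the")); // prints 2
    }
}
```

The generic type parameters (`List<String>`, `Map<String, Integer>`) let the compiler catch type errors at compile time instead of waiting for a `ClassCastException` at runtime, which is why the support code targets Java 1.5 rather than earlier versions.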
So basically, a requirement for this class is that you can handle doing the Java programming. And so hopefully you've either seen some Java before, or it won't be too difficult to kind of come up to speed. Java's obviously not that different from other languages like C++ or Python in the way most things work. And, you know, it's not that we're heavily using system libraries beyond a few basic things such as collections like ArrayLists. For most other information, see the course webpage. The little note on the bottom was: I surveyed people on whether they actually want paper handouts of everything, or whether they're completely happy getting everything off the webpage. And it seems like about 2/3 of people are perfectly happy just to deal with stuff electronically. So what we're going to do is print just a few copies of handouts, but not enough for everyone who turns up.

Okay, so I started talking a little bit about the class. I mean, in some sense this class is sort of like an AI systems class, in that there's less of a pure building up of coherent theory from the ground up in a concerted way. And it's more of a class that's built around, "Okay, there are these problems that we want to deal with in understanding natural language, and what kind of methods can we use to deal with those problems? And how can we build systems that work on them effectively?" So there's a lot of hands-on, doing things in assignments, working out how to get things to work kind of issues, and one of the things that you'll find doing that is that often practical issues, like working out how to define things right and making sure that the data's being tokenized correctly, can be just as important as theoretical niceties.
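The tokenization point above is concrete enough to illustrate. This is a hypothetical sketch, not the course support code's tokenizer: a naive whitespace split leaves punctuation glued to words, while even a slightly smarter pass separates it, which changes every downstream count:

```java
import java.util.Arrays;
import java.util.List;

public class TokenizeDemo {
    // Naive approach: split on runs of whitespace only.
    public static List<String> whitespaceTokenize(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    // Slightly smarter: pad common punctuation with spaces first,
    // so it comes out as separate tokens. (Illustrative only.)
    public static List<String> simpleTokenize(String text) {
        String spaced = text.replaceAll("([.,!?;:])", " $1 ");
        return Arrays.asList(spaced.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        String s = "I'm sorry, Dave.";
        System.out.println(whitespaceTokenize(s)); // [I'm, sorry,, Dave.]
        System.out.println(simpleTokenize(s));     // [I'm, sorry, ,, Dave, .]
    }
}
```

With the naive split, "sorry," and "sorry" would be counted as two different word types; getting details like this right is exactly the kind of practical issue the lecture is pointing at.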
And I think that's true of the field of NLP as a whole as well: if you look at papers in the field, papers tend to emphasize their key research idea that is cool and novel, but in practice the kind of systems people build, the ones that produce the experimental results of the papers, yes, they contain that idea, but they also contain a lot of other hard work getting all of the details right. And so that means that in this class we're going to sort of assume that people have some background and can exploit knowledge of a bit of linear algebra, a bit of probability and statistics, and that they have decent programming skills. Now, obviously not everyone has exactly the same background and skills. Some people know a ton about probability and machine learning, others not so much. So we're kind of hoping that as you go along you can pick up things that you don't know or are rusty on, and everyone can do well. And I think in practice that works out fairly well, because even people who've done a lot of machine learning often haven't seen so much of getting things to work in practical contexts. And so the class tries to do a mix between teaching the theory and actually learning techniques that can be used in robust, practical systems for natural language understanding.

Okay, so where does the idea of natural language understanding come from? The idea of natural language understanding is basically as old as people thinking about computers, because as soon as people started thinking about computers, and thinking about robots, they thought about wanting to communicate with them, and, well, the obvious way for human beings to communicate is to use language.

[Video playing] "I'm sorry, Dave. I'm afraid I can't do that."

Instructor (Christopher Manning): Okay, so there's my little clip of HAL from 2001: A Space Odyssey, but, I mean, actually, the idea goes much further back than that.
If you go back to the earliest origins of science fiction literature, you find in 1927's Metropolis the False Maria, who was a nonhuman robot, and, well, not surprisingly, you could talk to False Maria. And that's just a very natural interface modality for human beings to think about. So in general the goal of the field of NLP is to say that computers could be a ton more useful if they could do stuff for us, and then, as soon as you want computers to do stuff for us, well, then you notice that a lot of human communication is by means of natural language, and a lot of the information that is possessed by human beings, whether it's Amazon product catalogs, or research articles that are telling you about proteins, that that's information in natural

