DOC PREVIEW
MIT 6 00 - Lecture Notes

This preview shows page 1-2-3 out of 10 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 10 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

MIT OpenCourseWare http://ocw.mit.edu 6.00 Introduction to Computer Science and Programming, Fall 2008 Please use the following citation format: Eric Grimson and John Guttag, 6.00 Introduction to Computer Science and Programming, Fall 2008. (Massachusetts Institute of Technology: MIT OpenCourseWare). http://ocw.mit.edu (accessed MM DD, YYYY). License: Creative Commons Attribution-Noncommercial-Share Alike. Note: Please use the actual date you accessed this material in your citation. For more information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/termsMIT OpenCourseWare http://ocw.mit.edu 6.00 Introduction to Computer Science and Programming, Fall 2008 Transcript – Lecture 21 OPERATOR: The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu. PROFESSOR: So let's start. I have written a number on the board here. Anyone want to speculate what that number represents? Well, you may recall at the end of the last lecture, we were simulating pi, and I started up running it with a billion darts. And when it finally terminated, this was the estimate of pi it gave me with a billion. Not bad, not quite perfect, but still pretty good. In fact when I later ran it with 10 billion darts, which took a rather long time to run, didn't do much better. So it's converging very slowly now near the end. When we use an algorithm like that one to perform a Monte Carlo simulation, we're trusting, as I said, that fate will give us an unbiased sample, a sample that would be representative of true random throws. And, indeed in this case, that's a pretty good assumption. The random number generator is not truly random, it's what's called pseudo-random, in that if you start it with the same initial conditions, it will give you the same results. But it's close enough for, at least for government work, and other useful projects. We do have to think about the question, how many samples should we run? Was a billion darts enough? Now since we sort of all started knowing what pi was, we could look at it and say, yeah, pretty good. But suppose we had no clue about the actual value of pi. We still have to think about the question of how many samples? And also, how accurate do we believe our result is, given the number of samples? As you might guess, these two questions are closely related. That, if we know in advance how much accuracy we want, we can sometimes use that to calculate how many samples we need. But there's still always the issue. It's never possible to achieve perfect accuracy through sampling. Unless you sample the entire population. No matter how many samples you take, you can never be sure that the sample set is typical until you've checked every last element. So if I went around MIT and sampled 100 students to try and, for example, guess the fraction of students at MIT who are of Chinese descent. Maybe 100 students would be enough, but maybe I would get unlucky and draw the wrong 100. In the sense of, by accident, 100 Chinese descent, or 100 non-Chinese descent, which would give me the wrong answer. And there would be no way I could be sure that I had not drawn a biased sample, unless I really did have the whole population to look at. So we can never know that our estimate is correct. Now maybe I took a billion darts, and for some reason got really unlucky and they all ended up inside or outside the circle. But what we can know, is how likely it is that our answer is correct, given the assumptions. And that's the topic we'll spend the next few lectures on, at least one of the topics. It's saying, how can we know how likely it is that our answer is good. But it's always given some set of assumptions, and we have to worry a lot aboutthose assumptions. Now in the case of our pi example, our assumption was that the random number generator was indeed giving us random numbers in the interval to 1. So that was our underlying assumption. Then using that, we looked at a plot, and we saw that after time the answer wasn't changing very much. And we use that to say, OK, it looks like we're actually converging on an answer. And then I ran it again, with another trial, and it converged again at the same place. And the fact that that happened several times led me to at least have some reason to believe that I was actually finding a good approximation of pi. That's a good thing to do. It's a necessary thing to do. But it is not sufficient. Because errors can creep into many places. So that kind of technique, and in fact, almost all statistical techniques, are good at establishing, in some sense, the reproduce-ability of the result, and that it is statistically valid, and that there's no error, for example, in the way I'm generating the numbers. Or I didn't get very unlucky. However, they're other places other than bad luck where errors can creep in. So let's look at an example here. I've taken the algorithm we looked at last time for finding pi, and I've made a change. You'll remember that we were before using 4 as our multiplier, and here what I've done is, just gone in and replaced 4 by 2. Assuming that I made a programming error. Now let's see what happens when we run it. Well, a bad thing has happened. Sure enough, we ran it and it converged, started to converge, and if I ran 100 trials each one would converge at roughly the same place. Any statistical test I would do, would say that my statistics are sound, I've chosen enough samples, and for some accuracy, it's converting. Everything is perfect, except for what? It's the wrong answer. The moral here, is that just because an answer is statistically valid, does not mean it's the right answer. And that's really important to understand, because you see this, and we'll see more examples later, not today, but after Thanksgiving, comes up all the time in the newspapers, in scientific articles, where people do a million tests, do all the statistics right, say here's the answer, and it turns out to be completely wrong. And that's because it was some underlying assumption that went into the decision, that was not true. So here, the assumption is, that I've done my algebra right for computing pi based upon where the darts land. And it turns out, if I put 2 here, my algebra is wrong. Now how could I discover


View Full Document

MIT 6 00 - Lecture Notes

Download Lecture Notes
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Lecture Notes and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Lecture Notes 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?