Duke CPS 111 - Markov Chains

Markov Chains

Equipped with the basic tools of probability theory, we can now revisit the stochastic models we considered starting on page 29 of these notes. The recurrence (14) for the stochastic version of the sand-hill crane model is an instance of the following template:

    y(n) = a sample from p_{Y_n | Y_{n-1}, ..., Y_{n-m}}(y | y(n-1), ..., y(n-m)) .    (25)

The stochastic sand-hill crane model is an example of the special case m = 1:

    y(n) = a sample from p_{Y_n | Y_{n-1}}(y | y(n-1)) .    (26)

Recall that, given the value y(n-1) for the population Y_{n-1} (a random variable) of cranes in year n-1, we formulated probability distributions for the numbers of births (Poisson) and deaths (binomial) between year n-1 and year n. Because these probabilities depend on y(n-1), they are conditioned upon the fact that Y_{n-1} = y(n-1). If there are b births and d deaths, then

    Y_n = Y_{n-1} + b - d ,

so knowing the value y(n-1) of Y_{n-1} and the conditional probability distributions p_{b | Y_{n-1}} and p_{d | Y_{n-1}} of b and d given Y_{n-1} is equivalent to knowing the conditional distribution p_{Y_n | Y_{n-1}} of Y_n given Y_{n-1}.[12]

[12] To actually determine the form of this conditional distribution would require a bit of work. It turns out that the probability distribution of the sum (or difference) of two independent random variables is the convolution (or correlation) of the two distributions. The assumption that births and deaths are independent is an approximation, and is more or less valid when the death rate is small relative to the population. Should this assumption be unacceptable, one would have to provide the joint distribution of b and d.

A sequence of random variables Y_0, Y_1, ... whose values y(0), y(1), ... are produced by a stochastic recurrence of the form (25) is called a discrete Markov process of order m. The initial value y(0) of Y_0 is itself a random variable, rather than a fixed number.

In the sand-hill crane model, the value that the population Y_n can assume in any given year n is unbounded. This presents technical difficulties that it is wise to avoid in a first pass through the topic of stochastic modeling. We therefore make the additional assumption that the random variables Y_n are drawn out of a finite alphabet Y with K values. In the sand-hill crane example, we would have to assume a maximum population of K - 1 birds (including zero, this yields K possible values). In other examples, including the one examined in the next section, the restriction to a finite alphabet is more natural. A discrete Markov process defined on a finite alphabet is called a Markov chain. Thus:

To specify a Markov chain of order m (equation (25)) requires specifying the initial probability distribution

    p_{Y_0}(y(0))

and the transition probabilities

    p_{Y_n | Y_{n-1}, ..., Y_{n-m}}(y(n) | y(n-1), ..., y(n-m))

where all variables y(n) range over a finite alphabet Y of K elements. Note that in this expression y(0) and y(n), ..., y(n-m) are variables, not fixed numbers. For instance, we need to know the distribution p_{Y_0}(y(0)) for all possible values of y(0).

The Markov chain (25) is said to be stationary if the transition probabilities are the same for all n. In that case, their expression can be simplified as follows:

    p_{Y | Y_{-1}, ..., Y_{-m}}(y | y_{-1}, ..., y_{-m}) .
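As a minimal sketch of this specification, assuming the alphabet is labeled 1 through K, that p0 is a K-by-1 vector holding the initial distribution p_{Y_0}, and that row i of a K-by-K matrix P holds the transition probabilities out of state i (conventions and function names chosen here for illustration, not notation from these notes), a stationary first-order Markov chain can be simulated in Matlab as follows:

function y = simulate_chain(p0, P, N)
% Simulate N steps of a stationary, first-order Markov chain on {1, ..., K}.
% p0 is a K-by-1 vector with the initial distribution p_{Y_0}, and row i of
% the K-by-K matrix P holds the transition probabilities given Y_{-1} = i.
y = zeros(1, N);
y(1) = sample(p0);                 % draw y(1) from the initial distribution
for n = 2:N
    y(n) = sample(P(y(n-1), :));   % draw y(n) given the previous value y(n-1)
end
end

function k = sample(p)
% Draw one index k with probability proportional to p(k),
% by inverting the cumulative distribution at a uniform random number.
c = cumsum(p(:));
k = find(rand <= c / c(end), 1, 'first');
end

For example, with K = 2, p0 = [0.5; 0.5], and P = [0.9 0.1; 0.2 0.8], the call simulate_chain(p0, P, 10) produces a sequence of ten states, each drawn conditionally on its predecessor.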
A Language Model

To illustrate the concept of a Markov chain we consider the problem of modeling the English language at the low level of word utterances (that is, we do not attempt to model the structure of sentences). More specifically, we attempt to write a computer program that generates random strings of letters that in some sense look like English words.

A first, crude attempt would draw letters and blank spaces at random out of a uniform probability distribution over the alphabet (augmented with a blank character, for a total of 27 symbols). This would be a poor statistical model of the English language: all it specifies is what characters are allowed. Here are a few samples of 65-character "sentences" (one per line):

    earryjnv anr jakroyvnbqkrxtgashqtzifzstqaqwgktlfgidmxxaxmmhzmgbya
    mjgxnlyattvc rwpsszwfhimovkvgknlgddou nmytnxpvdescbg k syfdhwqdrj
    jmcovoyodzkcofmlycehpcqpuflje xkcykcwbdaifculiluyqerxfwlmpvtlyqkv

This is not quite compelling: words (that is, strings between blank spaces) have implausible lengths, the individual letters seem to come from some foreign language, and letter combinations are unpronounceable. The algorithm that generated these sequences is a zeroth-order stationary Markov chain where the initial probability distribution and the transition probabilities are all equal to the uniform distribution. "Transition" here is even a misnomer, because each letter is independent of the previous one:

    p_{Y_0}(y) = p_{Y | Y_{-1}, ..., Y_{-m}}(y | y_{-1}, ..., y_{-m}) = p_Y(y) = p_{U_27}(y)    (27)

(p_{U_27} is the uniform distribution over 27 points).

A moderate improvement comes from replacing the uniform distribution with a sample distribution from real text. To this end, let us take as input all of James Joyce's Ulysses, a string of 1,564,672 characters (including a few lines of header).[13] Some cleanup first converts all characters to lowercase, replaces non-alphabetic characters (commas, periods, and so forth) with blanks, and compacts any resulting runs of consecutive blanks into single blanks.

As an exercise in vector-style text processing, here is the Matlab code for the cleanup function:

function out = cleanup(in, alphabet)
% Convert to lower case
in = lower(in);
% Change unknown symbols to blank spaces
known = false(size(in));
for letter = alphabet
    known(in == letter) = true;
end
in(~known) = ' ';
% Find first of each sequence of nonblanks
first = in ~= ' ' & [' ' in(1:(end-1))] == ' ';
% Find last of each sequence of nonblanks
last = in ~= ' ' & [in(2:end) ' '] == ' ';
% Convert from logical flags to indices
first = find(first);
last = find(last);
% Replace runs of blanks with single blanks
out = '';
for k = 1:length(first)
    out = [out, in(first(k):last(k)), ' '];
end

The input argument in is a string of characters (the text of Ulysses, a very long string indeed), and alphabet is a string of the 26 lowercase letters of the English alphabet, plus a blank added as the first character. The output string out contains the cleaned-up result. Loops were avoided as far as possible: the loop on letter is very small (27 iterations), and the last loop on k seemed unavoidable.[14]
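As a usage sketch, assuming the raw text sits in a file named ulysses.txt (a hypothetical name) and using the illustrative variable names text, counts, and p_sample, the cleanup function could be applied and the sample distribution over the 27 characters estimated and sampled like this:

% Alphabet: a blank followed by the 26 lowercase letters.
alphabet = [' ', 'a':'z'];

% Read the raw text (the file name is hypothetical) and clean it up.
text = cleanup(fileread('ulysses.txt'), alphabet);

% Estimate the sample distribution over the 27 symbols by counting.
counts = zeros(1, length(alphabet));
for k = 1:length(alphabet)
    counts(k) = sum(text == alphabet(k));
end
p_sample = counts / sum(counts);

% Draw a 65-character "sentence" from this zeroth-order model.
cdf = cumsum(p_sample);
cdf = cdf / cdf(end);
sentence = blanks(65);
for n = 1:65
    sentence(n) = alphabet(find(rand <= cdf, 1, 'first'));
end
disp(sentence)

This mirrors the zeroth-order model of equation (27), with the uniform distribution p_{U_27} replaced by the measured letter frequencies.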

