DOC PREVIEW
CORNELL MATH 135 - The Friedman and Kasiski Tests

This preview shows page 1 out of 4 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 4 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 4 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Math 135 (Summer 2006)The Friedman and Kasiski TestsExample: Suppose that a sample of ordinary English contains the following distribution of letters:letter count letter countA 141 N 119B 36 O 132C 36 P 28D 103 Q 1E 188 R 95F 37 S 64G 34 T 182H 102 U 59I 123 V 13J 4 W 55K 18 X 3L 56 Y 2M 27 Z 0Suppose that a pair of letters is selected at random from this sample. What is the probability that thetwo letters selecte d will be identical?Solution:If this sample is subjected to a monoalphabetic substitution, then the probability of choosing an identicalpair from the resulting ciphertext is the same.Friedman’s Index of CoincidenceThe index of coincidence (denoted by I) for a (cipher)text is the probability that two letters selectedat random from it are identical.If the text has n0A’s, n1B’s, n2C’s, . . . , n25Z’s, andn = n0+ n1+ n2+ · · · + n25=25Xi=0niis the number of letters in the text, thenI =n0(n0− 1) + n1(n1− 1) + · · · + n25(n25− 1)n(n − 1)=1n(n − 1)25Xi=0ni(ni− 1).In the previous example, we found I = 0.0656.Example: What is the index of coincidence for a collection of 2600 letters consisting of 100 A’s, 100B’s, 100 C’s, . . . , 100 Z’s?Solution:The index of coincidence of a totally random (uniformly distributed) collection of letters is about 0.0385.Vigen`ere ciphertexts from longer keywords have a more uniform distribution of letters. For keywordlengths closer to 1, the index of coincidence will be larger (closer to 0.0656).Question: Can we quantify the connection betwee n index of coincidence and ke yword length?Answer: YES!Connection between I and Keyword LengthSuppose that the ciphertext has n letters and the Vigen`ere keyword has k letters. Arrange the ciphertextas follows:Each column is a shift encipherment by an amount c orresponding to that key letter. If we choose a pairat random, then either• they come from the same column, or• they come from two different column.In the first case, there are k ways to choose the the column, C(n/k, 2) ways to choose a pair of letters,and probability 0.065 that they are identical. Thus, the expected number of identical pairs from thesame column isS = k · C(n/k, 2) · 0.065.(Remembe r, if there are N independent trials with probability p of an event happening on a given trial,then the expected number of events which happen is N p.)In the second case, there are C(k, 2) ways to choose two different columns, n/k ways to choose a letterin one of them, n/k ways to choose a letter in the second, and probability126≈ 0.03846 that they areidentical. Thus, the expected number of identical pairs from different columns isD = C(k, 2) ·nk·nk· 0.03846.Therefore,I ≈S + DC(n, 2)=k · C(n/k, 2) · 0.065 + C(k, 2) ·nk·nk· 0.03846C(n, 2)=k ·nk(nk−1)2·1· 0.065 +k(k−1)2·1·nk·nk· 0.03846n(n−1)2·1=0.065(n − k) + 0.03846n(k − 1)k(n − 1)and so if we solve for kk ≈0.0265n(0.065 − I) + n(I − 0.0385).Example: Suppose that a polyalphabetic ciphertext has the following letter counts:letter count letter countA 60 N 28B 50 O 83C 42 P 44D 64 Q 69E 51 R 13F 63 S 29G 19 T 66H 48 U 87I 56 V 63J 67 W 19K 23 X 43L 45 Y 39M 44 Z 67Use Friedman’s index of coincidence to estimate the length of the keyword length.Solution:The Kasiski TestThe following Vigen`ere encipherment illustrates an interesting phenomenon:THEWEATHERINTHEFALLHASTHEMSWEATINGSAFETYSAFETYSAFETYSAFETYSAFETYSAFELHJAXYLHJVBLLHJJTJDHFWMFWMXAXYLISKTHEWEATHERINTHEFALLHASTHEMSWEATINGSAFETYSAFETYSAFETYSAFETYSAFETYSAFELHJAXYLHJVBLLHJJTJDHFWMFWMXAXYLISKLetter groups in the ciphertext are repeated because letter groups in the plaintext line up with thekeyword.If letter groups are repeated in the ciphertext, then the keyword length may be a divisor of their sepa-ration.Example: Determine a likely Vigen`ere keyword length for the following ciphertext:HIBKA UQFLF SBQSX SKCFB YOAGP ALGTC RTYTLDGBYO AGPAL OAKYB FBILY OYQTD ISVAI JJNNADXNLW NRQPF BVPWN IWAFB YAANR URTZE LYZLFMEWHI BKAUQ FLALJ GTXRG VNIJP ZREQL KWZAThere are repeated letter groups:HIBKA UQFLF SBQSX SKCFB YOAGP ALGTC RTYTLDGBYO AGPAL OAKYB FBILY OYQTD ISVAI JJNNADXNLW NRQPF BVPWN IWAFB YAANR URTZE LYZLFMEWHI BKAUQ FLALJ GTXRG VNIJP ZREQL KWZAThe separation (start to start) between the two occurrences of HIBKAUQFL is 108 = 33· 22letters, andthe separation between those of BYOAGPAL is 18 = 32· 2 letters. Common divisors of these two numbersare 2, 3, 4, 6, 9, 12, and 18, and so these are the most likely keyword


View Full Document

CORNELL MATH 135 - The Friedman and Kasiski Tests

Download The Friedman and Kasiski Tests
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view The Friedman and Kasiski Tests and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view The Friedman and Kasiski Tests 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?