Cryptology Procedures
Cryptology procedures provide methods for preparing plaintext and strategies for analyzing ciphertext, including the Kasiski Method for determining key length.
Plaintext Preparation
A cryptanalyst relies on regularity and predictability in a message as part of his analysis technique. To counter this, an encipherer normally follows certain procedures when preparing the plaintext message.
- Short Messages: The longer the text, the more chance of patterns occurring, and patterns are what cryptanalysts look for. Keep the plaintext as terse as possible to make cryptanalysis more difficult.
- Reduced Use of Common Words: Common words such as the, an and of can often be omitted without loss of message sense.
- Reduced Use of Abbreviations: If repeated use of a common abbreviation is required, on random occasions it should be spelt out. Once again a high repetition rate can lead to breaking the cipher.
- Avoidance of Likely Cribs: Certain words or phrases can be anticipated by code-crackers. Avoid them if possible, as they can give the analyst early signs of success in breaking a cipher.
- Standard Formats are Bad: Often opening and closing information is routine such as who the message is for and courtesy closings. By breaking the plaintext into sections and then intermixing them in a random fashion, there is a chance that the 'preamble' and 'postamble' may go undetected.
- Bad Spelling is Good: Variation of a word's spelling such as American/British forms and even homonyms or misspellings can be used to reduce repetition. Remember the key is to introduce randomness.
- Case Insensitivity: Uppercase may indicate sentence beginnings and/or proper nouns. Reduce plaintext to a single case! Uppercase only is commonly used for readability.
- No Word Breaks: Ciphertext messages should never be written in word format. A common method used in radio transmission is to block text in five-character sequences. Another technique which may confuse analysts for a short while is to write the ciphertext in random word-like groups. This is only a delaying tactic, but all ciphers are breakable and it is time that the encipherer is buying.
- No Punctuation: Punctuation makes sentence endings and beginnings predictable and so its elimination is mandatory. Embedding words like stop in the text is also foolish. The deciphered text should be readable and punctuation replaced manually if required.
- Random Message Fills: If a message is blocked in fixed character length format for transmission, fill characters should be random and not the predictable XXX string. And for variation, fills can be prepended to the text rather than appended.
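Several of the mechanical steps above (single case, no punctuation or word breaks, random fills, five-character blocks) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names `normalize`, `pad_with_random_fill` and `block` are hypothetical, the alphabet is taken to be A-Z, and in practice the blocking would normally be applied to the ciphertext after encipherment rather than to the plaintext.

```python
import secrets
import string

def normalize(message):
    """Reduce a message to uppercase A-Z only (no case, spaces or punctuation)."""
    return "".join(ch for ch in message.upper() if ch in string.ascii_uppercase)

def pad_with_random_fill(text, group=5, prepend=False):
    """Pad text to a multiple of `group` with random letters instead of XXX."""
    shortfall = -len(text) % group
    fill = "".join(secrets.choice(string.ascii_uppercase) for _ in range(shortfall))
    # For variation the fill can go in front of the text rather than behind it.
    return fill + text if prepend else text + fill

def block(text, group=5):
    """Write text in fixed-size groups separated by spaces."""
    return " ".join(text[i:i + group] for i in range(0, len(text), group))

prepared = pad_with_random_fill(normalize("Meet me at noon, by the old gate."))
print(block(prepared))  # five groups of five; the last letter is random fill
```

The receiver has to be able to recognise and discard the fill, so in real use its length or position would need to be agreed in advance or be obvious from the deciphered text.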
An interesting note is that the native language of the plaintext may cause problems for decrypting (reading by cracking the unknown algorithm). Sometimes it is just a localism or an obscure way of saying something. Or the plaintext can consist of scrambled words that are easily read by a native speaker but difficult for others, such as:
Aoccdrnig to a rscheearch at an Elingsh uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht frist and lsat ltteer is at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae we do not raed ervey lteter by itslef but the wrod as a wlohe.
Another example is the use of an obscure foreign language to delay decryption until the message's purpose has been served. The Navaho code talkers of WW2 fame are a perfect illustration. Look at the code used at Code Talk.
Cryptanalysis
One of the strongest tools of the cryptanalyst (aka code 'cracker') is frequency counts. Using known letter, digraph, trigraph, vowel and word counts in native languages and the ciphertext under analysis, clues as to the letter mappings can be tested in a logical order. You may wish to refer to the suite of programs that I have developed for this purpose.
ETAOIN SHRDLU is a mnemonic for remembering the order of frequency of letters in English. That is, E, T and A are the most common. The vowels AEIOU account for 38.5% of the text.
The most common digraphs are: TH IN ER RE AN HE AR EN TI TE AT ON HA OU IT ES ST OR
The most common trigraphs are: THE ING AND ION ENT FOR TIO ERE HER ATE VER TER THA
The 100 most common English words are:
THE FOR HAVE THIS BEEN WHEN ITS GREAT CAN COULD OF AS YOU MY HIM WHAT OUT NOW MADE VERY AND WITH WHICH THEY ONE YOUR INTO SUCH WELL MUCH TO WAS ARE ALL SO MORE OUR SHOULD OLD OWN A HIS ON THEIR IF WOULD THESE OTHER MUST MOST IN HE OR AN WILL THEM MAN ONLY US MIGHT THAT BE HER SHE THERE SOME UP ANY SAID FIRST IS NOT HAD HAS WHO THAN DO THEN TIME AFTER I BY AT WERE NO MAY LIKE ABOUT EVEN YET IT BUT FROM ME WE UPON SHALL THOSE NEW TWO
The preceding statistics are from Cryptanalysis by Helen Fouché Gaines.
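To compare a ciphertext against figures like these, the same counts have to be made on the ciphertext itself. The following is a minimal Python sketch of letter, digraph and trigraph counting plus the vowel share; it is not one of the programs mentioned above, and the names `ngram_counts` and `vowel_share` are my own. It assumes an A-Z alphabet and ignores everything else, so groups that straddle a space or block boundary are counted too.

```python
from collections import Counter

def letters_only(text):
    """Keep only A-Z, folded to uppercase."""
    return "".join(ch for ch in text.upper() if "A" <= ch <= "Z")

def ngram_counts(text, n=1):
    """Count every run of n consecutive letters (n=1 letters, 2 digraphs, 3 trigraphs)."""
    letters = letters_only(text)
    return Counter(letters[i:i + n] for i in range(len(letters) - n + 1))

def vowel_share(text):
    """Fraction of the letters that are A, E, I, O or U."""
    letters = letters_only(text)
    if not letters:
        return 0.0
    return sum(ch in "AEIOU" for ch in letters) / len(letters)

sample = "THE THEME"
print(ngram_counts(sample, 1).most_common(3))
print(ngram_counts(sample, 2).most_common(3))
print(vowel_share(sample))
```

Counting across word boundaries suits ciphertext sent in five-character blocks, where the spaces carry no meaning. For plaintext statistics you may prefer to count within words only.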
Trial Examples
The following is a simple exercise to practice on using intuition and/or frequency counts.
¡3XC3LL3N7 3X3RC153! 0n3 5umm3r d4y 45 1 w45 47 7h3 834ch 1 54W 7w0 61rl5 PL4Y1N6 1n 7h3 54nd. 7h3y w3r3 w0rk1n6 h4rd 70 8u1ld 4 54nd c457l3 w17h 7urr375, h1dd3n p4554635 4nd 8r1d635. 45 7H3Y W3R3 JU57 480U7 70 F1N15H 4L0N6 C4M3 4 L4R63 w4v3 d357r0y1n6 3v3ry7h1n6 4ND r3duc1n6 7h3 c457l3 70 4 l07 0f 54nd 4nd f04m. 1 7h0u6h7 7h47 4f73r 50 much 3ff0r7 7h3 61rl5 w0uld 8361n 70 CRY, 8u7 1n5734d 7H3Y r4n d0wn 7h3 834ch l4u6h1n6 4nd pl4y1n6 4nd 574r73d 70 8u1ld 4n07h3r c457l3. 1 r34l1z3d 7h47 1 h4d l34rn3d 4 6R347 l3550n. w3 0F73N 5p3nd 50 much 71m3 1N 0ur l1v35 8u1ld1n6 UP 7H3 M473R14L 4ND W0RLDLY P0553550N5 0F L1F3. 50M3 71M35 1N 0UR L1V35 4 w4v3 c0m35 4L0N6 70 d357r0y 3v3ry7h1n6 W3 H4V3 8U1L7. 4ND 700 0F73N W3 F0CU5 0N 0UR 54DN355 4ND FRU57R4710N 7H47 7H3 73MP0R4L 7H1N65 0F L1F3 H4V3 833N 74K3N FR0M U5 4ND W3 F0R637 7H47 7H3 6R347357 7H1N65 0F L1F3 4R3 4LW4Y5 0UR5 F0R 7H3 74K1N6 7H47 0F L0V3, FR13ND5H1P, 4FF3C710N, 71M3, F417H, KN0WL3D63, H0P3 4ND 7H3 H4ND5 4ND H34R75 0F 7H053 W3 L0V3 4ND 7H3 5M1L35 4ND L4U6H73R W3 C4N 5H4R3 W17H 7H3M.
The following enciphered message can be 'cracked' using frequency counts alone. But you may want to use the mapper program to help you test your assumptions. Use java mapset /testMap >test.bat to set up a starting batch file.
SCYJT OPNRM JTUEA WSROR OAEPQ RJCRO ARMPH QKJQS RSJHA XPFKE AQRMY SRPQP MPSEC AHGAW SROPE EESHA QOPVS HIROA QPFAE AHIRO PHNPQ RJHTF UAMCJ MRYRO MAAWA EEBTQ RWMSR ASRJH AIMJT KUAEJ WPHJR OAMPH NQAAW OPRYJ TQAAL
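If you do not have the mapper program to hand, the idea of testing an assumption can be sketched in Python. This is a hypothetical helper of my own (`apply_mapping`), not the mapper program itself: it takes a partial ciphertext-to-plaintext mapping, shows guessed letters in lowercase and marks everything still unknown with a dot. The mapping in the example is only a guess to be tested (that a frequently repeated trigraph stands for THE), not a claimed solution.

```python
def apply_mapping(ciphertext, mapping, unknown="."):
    """Show ciphertext with guessed letters substituted and the rest masked.

    `mapping` maps uppercase ciphertext letters to guessed plaintext letters.
    Non-letters (such as the spaces between blocks) are passed through.
    """
    out = []
    for ch in ciphertext:
        if ch.isalpha():
            out.append(mapping.get(ch.upper(), unknown))
        else:
            out.append(ch)
    return "".join(out)

# A guess to test: suppose the counts suggest that ROA stands for THE.
guess = {"R": "t", "O": "h", "A": "e"}
print(apply_mapping("WSROR OAEPQ RJCRO ARMPH", guess))
```

If the partly filled text starts to look like English, extend the mapping with the next most likely letter. If it produces impossible combinations, drop the guess and try another.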
Kasiski Method
Periodic polyalphabets can have their key length (i.e. the period) found by using the Kasiski method [Ref 2]. It relies on the fact that repeated sequences in the ciphertext can signal repeated plaintext sequences that lie a multiple of the period apart, so the distances between repeats point to the key length.
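The search for repeats is tedious by hand but easy to sketch in Python. The following is a rough illustration rather than a reference implementation: the names `repeat_distances` and `candidate_periods`, the default repeat length of 3 and the upper limit of 20 on the period are all my own assumptions. It records the distance between successive occurrences of each repeated sequence, then lets every distance "vote" for each period that divides it.

```python
from collections import Counter, defaultdict

def repeat_distances(ciphertext, length=3):
    """Distances between successive occurrences of each repeated sequence."""
    letters = "".join(ch for ch in ciphertext.upper() if "A" <= ch <= "Z")
    positions = defaultdict(list)
    for i in range(len(letters) - length + 1):
        positions[letters[i:i + length]].append(i)
    distances = []
    for starts in positions.values():
        # Sequences that occur only once contribute nothing here.
        distances.extend(b - a for a, b in zip(starts, starts[1:]))
    return distances

def candidate_periods(distances, max_period=20):
    """Count, for each possible period, how many distances it divides."""
    votes = Counter()
    for d in distances:
        for p in range(2, max_period + 1):
            if d % p == 0:
                votes[p] += 1
    return votes.most_common()

print(candidate_periods([12, 18, 30], max_period=10))
```

Small periods such as 2 and 3 divide many distances and so always collect votes, and some repeats are pure coincidence. Treat the result as a shortlist: the largest period that still divides most of the distances is usually the one to try first.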