Decoding the IRA
With these tools in hand I was able to find likely English source phrases for nearly all of the recovered keys. As we read more of the archived messages and broken ciphers we found clues to the way keys were chosen. On 5 May 1926 the IRA’s Department of Intelligence sent the unencrypted message shown in Figure 8, a list of key phrases to be used from 6 May through 14 May.23
Just to make the keys crystal clear to anyone who might obtain this page, a line was drawn through this list indicating the first twelve letters of each phrase: ‘Isms go in wave’, ‘Speak of the co’, and so on. These phrases may be from newspapers or magazines of the period. I was unable to find any of the sources in open literature. However, they were indeed used for some of the messages we decrypted: P69/48(50), sent on 6 May, used ‘ISMSGOINWAVE’ and P69/48(23), sent on 14 May, used ‘STALESTTRICK’. This represents a serious blunder in communication security, and cryptanalysts always welcome entries of this sort. It also broadened my search for keys from complete words and phrases to checking for keys starting at a word boundary but going for as many letters as needed, without paying attention to whether the key ends on a word boundary.
Figure 8. Sending keys in clear text, 5 May 1926.
One partially encrypted document included lists of keys used to communicate with each battalion and brigade, and with individuals.24 Even without decryption this kind of information can be very valuable to an opponent: it allows the analyst to see the extent and command structure of the army. Another partially encrypted message gave keys that had changed.25 Again, intercepting this message would be a great boon to the cryptanalyst, who might have lost contact with the keys but can continue reading the message traffic if this one is encrypted in a known key. To emphasise the importance of this message, the sender said in clear English ‘The following Key-words are now in use’ before giving the keywords in encrypted form.
As I broke more messages and found more keys using the Project Gutenberg book list, I found several keys that clearly came from Nathaniel Hawthorne’s The Scarlet Letter, including one dated 2 March 1927 with key ‘Surveyor Pue e’.26 The header of this message includes the notation ‘(Cipher – New formula)’. Looking further, we found and broke a message dated 24 February 1927 saying:27
Did courier at Xmas give you copy of Woolworth edition of novel The Scarlet Letter which was to be used for keys for cipher?
Using this clue I tried each possible starting point for keys in the Project Gutenberg online edition of The Scarlet Letter and found a number of keys that appeared in this book – many of them common phrases such as ‘however had be’ and ‘on this side of’, but some as distinctive as ‘the scarlet le’ and ‘a writhing hor’.
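A search of this kind can be sketched in Python. This is a minimal illustration rather than the program actually used. It assumes the usual columnar-transposition convention that a key's letters are numbered alphabetically with ties broken from left to right, it tries only starting points at word boundaries, and the names `key_order`, `word_start_keys` and `find_hats` are hypothetical.

```python
def key_order(key):
    """Column order implied by a key: letters ranked alphabetically,
    ties broken left to right (assumed convention)."""
    ranked = sorted(range(len(key)), key=lambda i: (key[i], i))
    order = [0] * len(key)
    for rank, pos in enumerate(ranked, start=1):
        order[pos] = rank
    return order


def word_start_keys(text, length=12):
    """Yield every run of `length` letters that begins at the start of a word."""
    letters, starts, in_word = [], [], False
    for ch in text.lower():
        if "a" <= ch <= "z":
            if not in_word:
                starts.append(len(letters))
            letters.append(ch)
            in_word = True
        elif ch not in "'’":          # apostrophes do not end a word
            in_word = False
    stream = "".join(letters)
    for i in starts:
        if i + length <= len(stream):
            yield stream[i:i + length]


def find_hats(text, recovered_order):
    """Candidate English keys whose column order matches a recovered key."""
    n = len(recovered_order)
    return [k for k in word_start_keys(text, n) if key_order(k) == recovered_order]


sample = ("soon, however, his look became keen and penetrative. "
          "A writhing horror twisted itself across his features,")
# Stand-in for a column order recovered by cryptanalysis.
recovered = key_order("awrithinghor")
print(find_hats(sample, recovered))
```

Run against a whole digitised book instead of `sample`, the same loop lists every word-aligned phrase that could have produced a given recovered key.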
However, a message dated 14 December 1926 detailed the method completely:28
Herewith method for using a different keyword for each.
Dispatch bearer will give you book to be used for this purpose.
Take the date of dispatch you are about to send.
Multiply the month by ten and add the date.
This gives you a number.
Take the page in the book corresponding with this number.
The first twelve letters in the fourth line on this page will be your keyword for that date.
For example take the date of this dispatch.
The number found is one hundred and thirty four.
The first twelve letters in the fourth line on page one hundred and thirty four are lampandsomet.
This would be the key word for this dispatch.
Verify this with book.
Name of book is The Scarlet Letter by Hawthorne.
As the sender suggested, I did indeed verify this with the book. I was unable to find the correct edition, but by comparing the position of the phrase in the digitised book with the page number derived from the above method using the date of the message I found that the key locations did indeed line up very well. We attempted to find an edition of The Scarlet Letter that appeared with these keys in exactly the right place, but without success – for example, none of the dozen or so editions in the University of California at Los Angeles (UCLA) library came close to matching the data. If we could find the correct edition, then the keys for these messages could be found the same way the original recipient would have found them: by finding the page number from the date of the message and going directly to the fourth line of the text to read off the key.
I wrote to Jude Patterson, a fellow cryptanalyst who is good at finding ‘hats’ (the original English of the equivalent key) for transposition keys. Jude had spent many years as a typesetter, and she had an interesting thought: that by testing different fonts we might be able to reconstruct the Woolworth edition closely enough to find precisely where the keys should fit. I sent seventeen recovered keys with their associated page numbers, as well as photocopies of some pages from cheap editions of various books from Britain and America from that period to give an idea of contemporary standards, fonts and conventions, and she went to work. Six months later, near the end of 2007, the breakthrough appeared in my inbox: Jude found that eleven-point Garamond type produced results that almost exactly matched our data points for the Woolworth edition, suggesting that this may indeed have been the typeface and size used in the original. She had needed to reconstruct by trial and error esoteric typographical conventions such as standards for dealing with widows (the last line of a paragraph at the top of a page) and orphans (the first line of a paragraph at the bottom of a page).
Jude Patterson wrote:29
11 pt Garamond by 19¼ picas is the only trial where most keys fell spot on or with minimal adjustments in hyphenation.
Having found this so-called ideal setting, I proceeded to trials for page depth, and having established page depth, I took the whole slice of The Scarlet Letter from pages 29 to 134 and ran the final trial. It was amazing how little twiddling was needed to get the pages to fall beautifully. I put myself in the shoes of the typesetter, trying to keep all pages the same length, allowing orphans but disallowing widows, inserting hyphens to reduce big spaces between words, ‘feathering’ the type with extra letter-spacing where needed to gain a line to avoid a widow.
We now had a best-guess equivalent of the Woolworth edition used to produce many of the keys for this period! For example, here is the beginning of page fifty-six with these settings:
external matters are of little value and import, unless
they bear relation to something within his mind. Very
soon, however, his look became keen and penetrative.
A writhing horror twisted itself across his features,
The key on the fourth line was used to encrypt a message dated 6 May.30 May is the fifth month, so we multiply it by ten and add the date: 50 + 6 = 56, the correct page number.
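The recipient's procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `page_for_date` and `key_from_page` are hypothetical, and the page is represented simply as a list of its printed lines, here the four reconstructed lines quoted above.

```python
def page_for_date(month, day):
    """Month times ten plus the day, as the dispatch instructs."""
    return month * 10 + day


def key_from_page(page_lines, line_no=4, length=12):
    """First `length` letters of the chosen line, ignoring spaces and punctuation."""
    letters = [c for c in page_lines[line_no - 1].lower() if "a" <= c <= "z"]
    return "".join(letters[:length])


# Opening lines of page fifty-six in the reconstructed setting.
page_56 = [
    "external matters are of little value and import, unless",
    "they bear relation to something within his mind. Very",
    "soon, however, his look became keen and penetrative.",
    "A writhing horror twisted itself across his features,",
]

print(page_for_date(12, 14))    # dispatch of 14 December
print(page_for_date(5, 6))      # message of 6 May
print(key_from_page(page_56))   # key read from the fourth line
```

Because the rule is simply month times ten plus day, two dates can share a page: 11 November and 1 December both give page 121.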
Decrypting the substitution ciphers
Nearly all the ciphers we encountered in these sets proved to use columnar transposition, either with or without the columns of dud letters. However, a substantial number of messages between GHQ and the outlying Irish battalions used a different system: mostly short fragments of cipher to encrypt the most sensitive parts of a message that was otherwise sent in clear English. We see a typical example of this in Figure 9:31
Have you yet got X&OYC&UIJO&MN? Did you look up that man FX&WA HKGKH/ whom I spoke to you about. I am most anxious that this case be followed up. I would suggest that if necessary you put your Staff Officer entirely on it until it is carried through.
Figure 9. Short Vigenère-style substitution ciphers, 4 May 1926.
These ciphers are strikingly different from the columnar transposition ciphers that form the bulk of the encrypted traffic. They’re very short, they are not broken up in five-letter groups, they include the symbol ‘&’, and they are mixed freely with plain English text. Most importantly to a cryptanalyst, their statistics are quite different from normal English: a count of the individual letters shows no obvious correlation between the high and low frequency letters in the messages and the high and low frequency letters in English. This indicates that the cipher used for these messages is a substitution cipher.
An easy and commonly used substitution cipher is called, appropriately enough, ‘simple substitution’. In this system a keyword may be chosen to mix the alphabet by any of a large variety of methods, and each letter of the plain text is substituted with the corresponding letter of the keyed alphabet. For example, with the key MONARCHY placed in a prearranged position within the alphabet, we could have a cipher alphabet that looks like this:
Plain: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher: Q S T U V W X Z M O N A R C H Y B D E F G I J K L P
A message is encrypted by finding each letter of the message on the ‘plain’ line and substituting for it the letter below on the ‘cipher’ line:
On a monkey’s day to die all trees become slippery.
HC Q RHCNVL’E UQL FH UMV QAA FDVVE SVTHRV EAMYYVDL.
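The table above can be reproduced with a short Python sketch. The construction used here (keyword first, then the unused letters in order, rotated so that M falls under plain ‘I’) is inferred from the printed alphabet and is an assumption about how the example was built; `keyed_alphabet` and `encrypt` are hypothetical names.

```python
import string


def keyed_alphabet(keyword, offset):
    """Keyword letters first, then the unused letters, rotated `offset` places to the right."""
    seen = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch not in seen:
            seen.append(ch)
    mixed = "".join(seen)
    return mixed[-offset:] + mixed[:-offset]


def encrypt(plain, cipher_alphabet):
    """Replace each letter with the one below it; punctuation passes through."""
    table = str.maketrans(string.ascii_uppercase, cipher_alphabet)
    return plain.upper().translate(table)


cipher_alphabet = keyed_alphabet("MONARCHY", 8)   # offset 8 puts M under plain I
print(cipher_alphabet)
print(encrypt("On a monkey's day to die all trees become slippery.", cipher_alphabet))
```

Decryption uses the same table with the two alphabets swapped in `str.maketrans`.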
With enough cipher text we can solve a simple substitution rather easily by looking at frequencies (e.g. the very common ‘E’, ‘T’, ‘A’) and pattern words (e.g. ‘trees’ with its double ‘E’ or ‘become’ with ‘E’s in the second and sixth positions). Analysing the messages this way got me nowhere. My chief roadblock was the length: most of the messages were too short to allow productive analysis. I set these substitution ciphers aside for several months while continuing to work on the outstanding transposition ciphers.
Having finished most of the columnar transpositions I returned to an intriguing set of substitution cipher messages from an IRA communications logbook, shown in Figure 10.32 The reward for solving these pages was clear: it is a list of encrypted keywords used to communicate in cipher with each of the IRA units, from Antrim (‘No code yet’) through Wicklow, as well as additional keys for correspondents out of Ireland. In all, the list contains keywords or contact information for fifty-seven recipients. Although most of these are short words or phrases, I hoped to combine them in a way that would give me some leverage into their solution. I resolved to try each likely common substitution method in turn.
After simple substitution, the next most common substitution cipher is known as Vigenère, named for sixteenth-century cryptographer Blaise de Vigenère.33 This method uses a key to choose among a number of different cipher alphabets to encrypt each letter of the message in turn. Using multiple alphabets increases the security of the cipher by evening out the frequencies of the letters and by eliminating the patterns of the letters within words. In its most basic form, for each alphabet Vigenère uses the Caesar cipher described earlier, counting down the alphabet no letters for key-letter ‘A’, one for key-letter ‘B’ and so on, using the key in order and repeating it as needed. As an example using keyword FACE:
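A minimal Python sketch of this basic form with keyword FACE follows. It assumes the standard convention in which key-letter ‘A’ leaves a letter unchanged and ‘B’ moves it one place, which is the convention the key GVZKLG obeys later in this section. The plain text ATTACK AT DAWN is an illustrative choice, and `vigenere` is a hypothetical name.

```python
import string

A = string.ascii_uppercase


def vigenere(text, key, decrypt=False):
    """Shift each letter by its key letter (A = 0, B = 1, ...), repeating the key.
    Non-letters are copied unchanged and do not use up a key letter."""
    sign = -1 if decrypt else 1
    out, k = [], 0
    for ch in text.upper():
        if ch in A:
            shift = A.index(key[k % len(key)].upper())
            out.append(A[(A.index(ch) + sign * shift) % 26])
            k += 1
        else:
            out.append(ch)
    return "".join(out)


cipher = vigenere("ATTACK AT DAWN", "FACE")
print(cipher)
print(vigenere(cipher, "FACE", decrypt=True))
```

With a four-letter key, the first, fifth and ninth letters are all shifted by ‘F’, the second, sixth and tenth by ‘A’, and so on.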
Some writers, including Lewis Carroll, called the cipher ‘undecipherable’, but cryptographers of the sixteenth century had already broken it on occasion.34 The Confederate States of America trusted it implicitly, and used it throughout the American Civil War with only three keys. The northern side (the Union) had no trouble reading their message traffic.35 The cipher may be executed entirely by hand, as shown above, or with a twenty-six by twenty-six table showing each alphabet, or using a cipher disk or slide that can be moved to indicate the correspondence between plain and cipher letters.
Figure 10. Encrypted cipher keywords, communications logbook, no date.
The cryptanalyst’s leverage in Vigenère-like ciphers comes from the periodic nature of the cipher. All letters encrypted with ‘F’ above are from the same alphabet, so that if we look at every fourth letter we will be seeing only letters encrypted with the same key letter. If we have enough material to work with, this alone will be enough to break that particular alphabet, because the equivalent of ‘E’, ‘T’, ‘A’ and so on will have the highest frequency in the cipher alphabet, and they will be in the same position relative to each other because the cipher alphabet is a simple Caesar shift of the standard A–Z alphabet. For example, the cipher equivalent of ‘E’ will appear four letters after the cipher equivalent of ‘A’. This process may be repeated for the other assumed alphabets, finding the best Caesar shift for each.
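The column-by-column attack can be sketched as follows. This is a simple chi-squared fit of each column against approximate English letter frequencies, not the hillclimbing program used on the IRA material. The frequency table holds rounded textbook values, the function names are hypothetical, and the sample plain text is stitched together from passages quoted in this chapter purely as test material.

```python
import string

A = string.ascii_uppercase
# Approximate relative frequencies of A-Z in English text, in percent.
ENGLISH = [8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77,
           4.03, 2.41, 6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98,
           2.36, 0.15, 1.97, 0.07]


def best_shift(column):
    """Caesar shift under which `column` looks most like English."""
    n = len(column)
    if n == 0:
        return 0
    counts = [column.count(ch) for ch in A]

    def chi2(shift):
        total = 0.0
        for i in range(26):
            expected = n * ENGLISH[i] / 100
            observed = counts[(i + shift) % 26]   # plain letter i appears as i + shift
            total += (observed - expected) ** 2 / expected
        return total

    return min(range(26), key=chi2)


def recover_key(cipher, period):
    """Split the letters into `period` columns and solve each as a Caesar cipher."""
    letters = [ch for ch in cipher.upper() if ch in A]
    columns = ["".join(letters[i::period]) for i in range(period)]
    return "".join(A[best_shift(col)] for col in columns)


def encrypt(plain, key):
    """Plain Vigenère encryption (A = 0), used here only to build a test cipher."""
    letters = [ch for ch in plain.upper() if ch in A]
    return "".join(A[(A.index(ch) + A.index(key[i % len(key)])) % 26]
                   for i, ch in enumerate(letters))


sample = (
    "Did courier at Xmas give you copy of Woolworth edition of novel The Scarlet "
    "Letter which was to be used for keys for cipher? External matters are of little "
    "value and import, unless they bear relation to something within his mind. Very "
    "soon, however, his look became keen and penetrative. A writhing horror twisted "
    "itself across his features. I am most anxious that this case be followed up. I "
    "would suggest that if necessary you put your Staff Officer entirely on it until "
    "it is carried through."
)
print(recover_key(encrypt(sample, "FACE"), 4))
```

The method needs a reasonable number of letters in every column. With only a handful of letters per alphabet the frequency counts are too noisy to trust, which is why short messages resist this attack.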
To test for a Vigenère-style cipher, then, we need enough material encrypted in the same key to find a statistical pattern in the letter distributions. Although most of the words encrypted in Figure 10 are short, I postulated that each could be encrypted with a polyalphabetic cipher such as Vigenère with the key beginning anew with each entry.
From this group I took every cipher of at least six letters whose first six letters did not include the non-alphabetic character ‘&’, and kept only those first six letters. This gave a depth of twenty-two encrypted words all (by assumption) starting at the same place in the key:
SDRDPX VVQDTY WXGKTX SJMCEK LPMOCG MVLLWK HMNMLJ VDBDFX UMDMWO GGCOCS MMNEYJ KHAKCQ LPQXLI HMHQLT IJMPWG DDMCEX HVQDSU OISOCX DXNXEO IJLWPS IJNBOO OIREAK
I presented this to my Shotgun Hillclimbing programme, telling it to treat the text as a period-six Vigenère, and it immediately returned mostly reasonable text with key GVZKLG:
mister partis qchair monste funera gamble brocad pictur orecli alderm ground embark furnac brigan confla xinstr bartho interr xconti commem coordi insupe
These beginning fragments plainly show that the method used was equivalent to Vigenère, using a keyword of at least five letters: the repeated ‘G’ in the key could mean that the key has begun to repeat, or that the keyword used has a repeated letter in that position. Further experimentation with the longer words and passages in this document showed that the latter is true: the full Vigenère keyword is GVZKLG, and it repeats as long as necessary to finish the section it encrypts. The ‘&’ could now be determined from context: if it is replaced with ‘Z’ in the cipher text, it encrypts to the right letter using standard Vigenère decryption.
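The decryption of the depth can be checked with a few lines of Python. This sketch assumes standard Vigenère subtraction with ‘A’ counted as zero and the key restarting at each entry, as postulated above; `decrypt_entry` is a hypothetical name. The last line applies the ‘&’ as ‘Z’ rule to one of the longer Figure 10 entries.

```python
import string

A = string.ascii_uppercase


def decrypt_entry(cipher, key="GVZKLG"):
    """Standard Vigenère decryption of one entry, key restarted at its first letter.
    '&' is read as 'Z'; the entry is assumed to contain no other non-letters."""
    out = []
    for i, ch in enumerate(cipher.upper().replace("&", "Z")):
        out.append(A[(A.index(ch) - A.index(key[i % len(key)])) % 26])
    return "".join(out).lower()


depth = ("SDRDPX VVQDTY WXGKTX SJMCEK LPMOCG MVLLWK HMNMLJ VDBDFX UMDMWO GGCOCS MMNEYJ "
         "KHAKCQ LPQXLI HMHQLT IJMPWG DDMCEX HVQDSU OISOCX DXNXEO IJLWPS IJNBOO OIREAK").split()

print(" ".join(decrypt_entry(c) for c in depth))
print(decrypt_entry("DDMCEXAXSS&T"))
```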
Why GVZKLG? After some experimentation, I found that it is derived from a reversal of the alphabet:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Z Y X W V U T S R Q P O N M L K J I H G F E D C B A
If each letter in the Vigenère key is replaced with the corresponding letter in the reversed alphabet, we get G=T, V=E, and so on, so that the actual key used in ‘IRA Vigenère’ is TEAPOT. A reversed encryption alphabet of this sort is called Atbash, and the technique was used in the Bible. David Kahn points out that in Jeremiah 25:26 and 51:41 the word Sheshach appears in place of Babel (Babylon).36 The repeated Hebrew ‘beth’ of Babel becomes the repeated ‘shin’ of Sheshach: beth is the second letter of the Hebrew alphabet, and shin is the penultimate. Kahn cites an Aramaic paraphrase of the passage using Babel in place of Sheshach to prove they mean the same thing. Atbash may have been applied to the IRA substitution keys to give a little more security against someone who suspects a Vigenère cipher system and has captured an English key to try.
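The key conversion is a one-line translation in Python. The sketch below only illustrates the reversed-alphabet mapping; `atbash` is a hypothetical name.

```python
import string

A = string.ascii_uppercase
ATBASH = str.maketrans(A, A[::-1])   # A<->Z, B<->Y, C<->X, ...


def atbash(word):
    """Replace each letter with its mirror image in the alphabet."""
    return word.upper().translate(ATBASH)


print(atbash("TEAPOT"))          # English keyword to working Vigenère key
print(atbash(atbash("TEAPOT")))  # applying Atbash twice restores the keyword
```

Arithmetically, with letters numbered from A = 0, the Atbash image of key-letter k is 25 − k, so subtracting it during decryption is the same as adding k + 1 modulo 26.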
Some of the keywords in this table include letters at beginning and/or end that are not part of the actual keyword. For example, the encrypted key for Boyne Batt is WXGKTXB, which decrypted with keyword TEAPOT becomes QCHAIRV. The longer encrypted key for Claremorris Bde is DDMCEXAXSS&T, which becomes XINSTRUCTION. In the course of later decryptions it became clear that these include nulls: the key for Boyne is actually CHAIR, and the key for Claremorris is INSTRUCTION. In the substitution examples throughout the corpus we found that nulls were frequently used at the beginning and/or end, especially ‘Q’, ‘X’, ‘Y’ and ‘Z’.
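A first pass at removing nulls can be sketched as below. The list of likely null letters comes from the text; treating every leading or trailing ‘Q’, ‘X’, ‘Y’ or ‘Z’ as a null is an assumption that would misfire on a genuine keyword beginning or ending with one of those letters, and `strip_nulls` is a hypothetical name.

```python
def strip_nulls(text, nulls="QXYZ"):
    """Drop leading and trailing letters that are likely to be nulls."""
    return text.upper().strip(nulls)


for word in ("XINSTRUCTION", "QCHAIRV"):
    print(word, "->", strip_nulls(word))
```

The second example shows the limit of the rule: the final ‘V’ of QCHAIRV is also a null, but it can only be recognised from context.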
We can now address the message in Figure 9 that began this section:37
Have you yet got X&OYC&UIJO&MN? Did you look up that man FX&WA HKGKH/ whom I spoke to you about.
The message is an internal GHQ communication from the chief of staff to the director of intelligence, so the key used is the same as the one used to encrypt all the keys for internal consumption: TEAPOT. The solution is unambiguous:
Have you yet got report on Keogh? Did you look up that man z Campbell x whom I spoke to you about.
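The whole procedure for these fragments can be put together in one short Python sketch: convert the English keyword with Atbash, read ‘&’ as ‘Z’, and decrypt as standard Vigenère. It is an illustration under those assumptions, plus the further assumption that spaces and the trailing ‘/’ do not use up a key letter; `ira_decrypt` is a hypothetical name.

```python
import string

A = string.ascii_uppercase


def atbash(word):
    """Mirror each letter in the alphabet (A<->Z, B<->Y, ...)."""
    return word.upper().translate(str.maketrans(A, A[::-1]))


def ira_decrypt(cipher, keyword):
    """Vigenère-decrypt with the Atbash image of `keyword`.
    '&' is read as 'Z'; other non-letters are copied and do not advance the key."""
    key = atbash(keyword)
    out, k = [], 0
    for ch in cipher.upper().replace("&", "Z"):
        if ch in A:
            out.append(A[(A.index(ch) - A.index(key[k % len(key)])) % 26])
            k += 1
        else:
            out.append(ch)
    return "".join(out)


print(ira_decrypt("X&OYC&UIJO&MN", "TEAPOT"))
print(ira_decrypt("FX&WA HKGKH/", "TEAPOT"))
```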
The second encrypted bit includes the nulls ‘Z’ and ‘X’ in an attempt to disguise the name further. Repeated uses of the name encrypted the same way with the same key would be a security problem: even if a person intercepting the messages could not solve the cipher, they could tell that the same person was being discussed because the encrypted version would be the same. Adding a letter to the front (z Campbell x) is enough to make it different, but unless different numbers of letters are used each time the name is sent in the future, CAMPBELL will be encrypted with the same part of the key and will appear the same when encrypted.
The particularly interesting message shown in Figure 11, from Seán Lemass, the republican Minister for Defence, to Seán Russell, the quartermaster general, was one of the most cryptic and one of the shortest.38
Figure 11. Soviet use of IRA officer, 3 October 1925.
We knew from other messages that ‘Mr. X’ was a Soviet agent. The encrypted part is ETNMMEE. Decrypting this with the GHQ substitution keyword TEAPOT gives us the plain text: YY OCB YY. As we’ve seen above, the ‘Y’s are nulls used for padding around the three-letter message. OCB (or OC.B) stands for Officer Commanding, Britain.39
Choosing the correct keyword to decrypt the substitution ciphers in this collection was sometimes challenging. Not all the messages had obvious senders and recipients, so it was not simply a matter of pulling the keyword off the master list for those correspondents. In addition, not all of the keywords were on any list: some had been superseded, and others were assigned after the lists in our possession were compiled.