preparing for himself a less haphazard listing of the cryptogram’s repeated sequences. Perhaps the most satisfactory way of doing this is to begin by making a general frequency count. Then, in order to have the more reliable information at once, start the tabulations by examining those letters whose frequency is only 2; follow this with an examination of those having a frequency of 3, and so on. The theory is that letters of these frequencies are much more likely to belong to only one alphabet, while the letters of higher frequency have probably been enciphered in several different alphabets, so that their repeated sequences are not so sure to be periodic.

Figure 105
Individual Frequency Counts - PERIOD 5
Alphabet 1 Alphabet 2 Alphabet 3 Alphabet 4 Alphabet 5
A 11 11 1 111
B 11 1 11 1
C 111 1 11
D 111 1 1
E 1111 1
F 11 1 1 111
G 1 111 1
H 11 11 1
I 11 11 1111
J 111 11 1 1
K 1 111 111
L 1 1 11111 11
M 1 111 11
N 11111 11 11 1 1
O 1 11111 1 11
P 111 111 1
Q 1111 1111 1
R 11
S 1111 1 11
T 1 1
U 1 11 11111 11 111 1
V 1 1 111 11
W 11 111 11 1
X 1111 11
Y 111
Z 1

For other cases in which there may be some doubt, the writer’s advice is to select large factors in preference to small factors. Or, if the decision must be made between two factors such as 6 and 7, where a period of 42 would be necessary in order to include both, simply select the handiest and give it a trial. With the longer cryptograms, as we shall see in a moment, an error in the choice is very speedily discovered; as to the shorter cryptograms, there is one rule which invariably holds good: If you meet with any resistance at all in dealing with the kind of ciphers which were shown in the past two chapters, you have probably selected the wrong period.

Often, however, where one clue is missing, there will be another present to take its place. Repeated trigrams are less likely than repeated digrams to be accidental, and longer repeated sequences are still less likely to be so. In the present tabulation, we find that three of the repetitions are trigrams; in all three cases the period 5 is suggested, while only one suggests also a period 3. That is, if we use a period of 15, two of these trigrams will have to be considered accidental.
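The weighing of repeated sequences against possible periods can be mechanized. What follows is only a minimal sketch in Python and not the procedure of the text: the function names and the short sample string are invented for illustration, and the upper limit of 20 placed on the factors is an arbitrary assumption.

```python
from collections import defaultdict

def repeated_sequences(text, length=3):
    """Map every sequence of `length` letters that occurs more than once
    to the list of positions at which it starts."""
    positions = defaultdict(list)
    for i in range(len(text) - length + 1):
        positions[text[i:i + length]].append(i)
    return {seq: pos for seq, pos in positions.items() if len(pos) > 1}

def factors(n, smallest=2, largest=20):
    """Candidate periods: the divisors of n between `smallest` and `largest`."""
    return [f for f in range(smallest, largest + 1) if n % f == 0]

def interval_factors(text, length=3):
    """For each repeated sequence, list the intervals between successive
    occurrences together with the factors of each interval."""
    report = {}
    for seq, pos in repeated_sequences(text, length).items():
        intervals = [b - a for a, b in zip(pos, pos[1:])]
        report[seq] = [(d, factors(d)) for d in intervals]
    return report

# Invented sample, NOT the cryptogram discussed in the text.
sample = "ABCDEABCDFGHIJKABC"
report = interval_factors(sample)
for seq, found in report.items():
    print(seq, found)
```

An interval of 15 yields the factors 3, 5, and 15, which is the situation described above: such a repetition supports the period 5 and also the period 3, while an interval of 5 or 10 supports only the former.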

If the period here is 5, then we are dealing with five simple substitution alphabets. These five alphabets have been used over and over again, always in a given rotation; therefore, if the cryptogram be rewritten into five columns (it is already conveniently grouped), the letters in each column will belong to one same alphabet, and it becomes possible to take a separate frequency count on each one of these five alphabets. These individual frequency counts may be seen in Fig. 105. Originally, we had a length of 170 letters, and, if the student desires to take a frequency count on the complete cryptogram, he will find that he has no truly predominant letters which could represent some of the letters E T A O N I R S H. Instead, he has a series of frequencies which are all fairly close to 4% of the text (6 or 7), and which, should he rearrange them in decreasing order, would have somewhat the following appearance: 10-10-10-9-9-9-8-8-8-7-7-7-7. . . . . .3-2-2-1. He will probably find, also, that every letter of the alphabet has been used at least once, something which would be very rare indeed in any normal English text of 170 letters. But in these five individual frequency counts, each belonging to a separate alphabet, matters are different. Here, the alphabets represented have a length of only 34 letters each, and yet, in the third, fourth, and fifth alphabets, there is one predominant letter, which could represent E, or some other letter which has taken the place of E, while, in the first and second alphabets, there are some few letters distinctly more prominent than others. Also, each alphabet has shown some gaps in sequence, where letters of the class J K Q X Z, and possibly also some letters like B P V W, would surely be missing in a normal text of only 34 letters.
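The rewriting into columns and the separate counts just described may be sketched as follows. This is an illustration under stated assumptions: the helper names are hypothetical, and the sample text is invented, since the cryptogram itself is not reproduced here.

```python
from collections import Counter

def column_counts(ciphertext, period):
    """Return one frequency count per alphabet: letter i of the
    cryptogram is assigned to column i mod period."""
    letters = [c for c in ciphertext.upper() if "A" <= c <= "Z"]
    return [Counter(letters[i::period]) for i in range(period)]

def frequency_profile(counts):
    """Frequencies alone, arranged in decreasing order."""
    return sorted(counts.values(), reverse=True)

# Invented sample in five-letter groups, NOT the cryptogram of the text.
sample = "ABCDE ABCDE ABXDE"
columns = column_counts(sample, 5)
for number, counts in enumerate(columns, start=1):
    print("Alphabet", number, dict(counts), frequency_profile(counts))
```

Comparing `frequency_profile` for the whole cryptogram with the profile of each column shows the contrast described above: a flat series for the mixed text, and a more uneven one for each single alphabet when the period is correct.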

A frequency count made on columns is not, of course, normal. We saw this in dealing with transpositions, when we considered vowel-distribution. Yet, as length increases, we find that the letters present in columns begin to approach more and more the proportions found in normal text; here, with only 34 letters, it would be possible, in any one of these frequency counts, to assign the letters to groups of high, moderate, and low frequencies. Whenever our frequency counts do not have this general aspect, the period cannot be correct. (There are, of course, the very short cryptograms, in which the actual frequencies are not apparent.) So far, we are dealing with any cipher whatever of the periodic type, and many of these ciphers do not make use of simple shifted alphabets, or even of alphabets which are in any way related to one another.

Now let us consider the one case in which the alphabets are all “Caesars.” In this case, whether the cipher is Vigenère, Beaufort, or Porta, we have only to identify one letter in order to identify a whole alphabet. Suppose we examine, first, alphabet 5, in which the one outstanding letter, L, has appeared 7 times. Does this letter represent e? If L of alphabet 5 represents e, then, counting backward (that is, upward), we find that the letter a will have to be represented by H; this alphabet, then, will be the H-alphabet if the cipher is Vigenère. The letter H has a frequency of only 1, which, in normal text, is not particularly satisfactory as the frequency of a, but this frequency count has not been taken from normal continuous text; suppose we examine the rest of the alphabet, and find out what the frequencies would be for other letters. Beginning at H, and calling letters in the order a, b, c, we find that this fifth alphabet, provided it is the H-alphabet, will contain: 3 d’s, 7 e’s, 2 h’s, 2 l’s, 2 o’s, 3 r’s, 3 t’s, and 3 y’s. That is, each letter present which shows a frequency greater than 1 will represent some plaintext original which, normally, is of some frequency, the only exception being y, which is a vowel. This is the best we can expect of any columnar frequency count made on only 34 letters; but more convincing still, and more reliable, is the fact that out of the entire group j k q x z we find only x, represented once. Alphabet 5, then, is entirely acceptable as the H-alphabet of the Vigenère cipher.
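The counting backward from L to H is a matter of simple modular arithmetic, and may be sketched as below. The sketch assumes a Vigenère cipher on the normal A-to-Z alphabet; the function names are hypothetical, and the partial count used contains only the two frequencies actually stated above (L seven times, H once).

```python
A = ord("A")

def key_letter(cipher_letter, plain_letter):
    """Vigenère key letter under which `plain_letter` is enciphered
    as `cipher_letter` (assumes the normal 26-letter alphabet)."""
    shift = (ord(cipher_letter.upper()) - ord(plain_letter.upper())) % 26
    return chr(A + shift)

def plaintext_counts(cipher_counts, key):
    """Restate a column's cipher-letter counts as counts of the
    plaintext letters they would represent under `key`."""
    shift = ord(key.upper()) - A
    return {
        chr(ord("a") + (ord(c.upper()) - A - shift) % 26): n
        for c, n in cipher_counts.items()
    }

# Partial count for alphabet 5: only the figures given in the text.
alphabet_5_partial = {"L": 7, "H": 1}
key = key_letter("L", "e")
print(key)                                         # H
print(plaintext_counts(alphabet_5_partial, key))   # {'e': 7, 'a': 1}
```

Supplying the complete count for a column in place of the partial one would give the full list of plaintext frequencies to be judged, as is done above for the letters d, e, h, l, o, r, t, and y.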

Let us see what we can find out about alphabet 3. Here, the strongly predominant letter is U. But when we attempt to identify this as e, we find that we should have to accept an alphabet containing 3 q’s, 2 x’s, and 3 z’s, all occurring in only thirty-odd letters of text. We meet with similar trouble when we attempt to identify U as t, as a, as o, and so on. It is not until we try it as s that we have good luck, finding only a series of blanks to represent the letters b, j, k, q, v, w, x, and z. And if U represents s, this alphabet begins at C. Alphabet 3, then, is entirely acceptable as the C-alphabet of the Vigenère cipher, and we have two of the key-letters: * * C * H.
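The successive trials of the predominant letter as e, t, a, o, and so on can be run in one pass, rejecting any identification which loads the alphabet with letters of the class j k q x z. The following is a rough sketch only: the names are hypothetical, the count of rare letters is a crude stand-in for the judgment exercised above, and the sample column is manufactured from an invented plaintext enciphered in the C-alphabet, not taken from the cryptogram.

```python
from collections import Counter

RARE = set("jkqxz")
CANDIDATES = "etaonirsh"   # high-frequency letters, tried in this order

def rare_letter_load(cipher_counts, key):
    """How many letters of the column would decipher to j, k, q, x,
    or z if the column were the Vigenère alphabet `key`."""
    shift = ord(key) - ord("A")
    load = 0
    for c, n in cipher_counts.items():
        plain = chr(ord("a") + (ord(c) - ord("A") - shift) % 26)
        if plain in RARE:
            load += n
    return load

def try_identifications(cipher_counts, candidates=CANDIDATES):
    """Assume the most frequent cipher letter stands for each candidate
    in turn; return (rare-letter load, candidate, key), best first."""
    top = max(cipher_counts, key=cipher_counts.get)
    trials = []
    for plain in candidates:
        shift = (ord(top) - ord("A") - (ord(plain) - ord("a"))) % 26
        key = chr(ord("A") + shift)
        trials.append((rare_letter_load(cipher_counts, key), plain, key))
    return sorted(trials)

# Invented column: a plaintext rich in s, enciphered with key C.
sample_plain = "sessionsassessstresslesssuccess"
sample_column = Counter(
    chr(ord("A") + (ord(c) - ord("a") + 2) % 26) for c in sample_plain
)
trials = try_identifications(sample_column)
loads = {plain: load for load, plain, key in trials}
keys = {plain: key for load, plain, key in trials}
print(trials)
```

In this manufactured column the leading letter is U. Taking it as e turns every true e into a q, so that the trial is rejected at once, while taking it as s leaves the rare group entirely blank and gives the key-letter C.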

In alphabet 1, the leading letter, N, is not so strongly predominant, and yet, when we assume it as the substitute for e, we find that the rest of the count is satisfactory. Alphabet 1, then, is acceptable as the J-alphabet of the Vigenère cipher and we have three of the key-letters: J * C * H.

In alphabet 2, we find no one leading letter, but the two most prominent frequencies are standing opposite E and S, as if this count might represent the normal alphabet itself. The absence of O and the presence of only one T is hardly significant in a columnar count; but further examination shows an excess of M’s and W’s, and this is more disturbing. However, a single K has appeared as the only representative of the group J K Q X Z; the low-frequency letters B and V have appeared but once each; and there is an absence of Y’s to counterbalance those which were too numerous in one of our other alphabets. So that a detailed examination, and the failure to identify this as any other alphabet, will lead to its tentative acceptance as the A-alphabet of the Vigenère cipher. (We can know definitely when we attempt to decipher with it.) With alphabet 2 accepted as the A-alphabet, we now have four of the key-letters: J A C * H. We shall return in a moment to consider the one which is still missing; but according to those present, it does not look as if our key is going to develop into a recognizable word.