Check Digits

Check Digit
A decimal (or alphanumeric) digit added to a number for the purpose of detecting the sorts of errors humans typically make on data entry.

When human beings use numbers - whether keying them into computers, dialing them on telephones, or reading them and telling them to others --- they tend to make certain kinds of mistakes more often than others. According to Richard Hamming (Coding and Information Theory, 2e, Prentice-Hall, 1986, p. 27), the two most common human errors are:

• Interchanging adjacent digits of numbers:
67 becomes 76
• Doubling the wrong one of a triple of digits, two adjacent ones of which are the same:
667 becomes 677
J. Verhoeff (Error Detecting Decimal Codes, Mathematical Centre Tract 29, The Mathematical Centre, Amsterdam, 1969, cited in Wagner and Putter, "Error Detecting Decimal Digits", CACM, Vol 32, No. 1 (January 1989), pp. 106-110) gives a more detailed categorization of the sorts of errors humans make in dealing with decimal numbers, based on a study of 12000 errors:
• single errors: a becomes b (60% to 95% of all errors)
• omitting or adding a digit (10% to 20%)
• adjacent transpositions: ab becomes ba (10% to 20%)
• twin errors: aa becomes bb (0.5% to 1.5%)
• jump transpositions: acb becomes bca (0.5% to 1.5%)
• jump twin errors: aca becomes bcb (below 1%) [lower for longer jumps]
• phonetic errors: a0 becomes 1a [since the two have similar pronunciations in some languages, e.g. thirty and thirteen] (0.5% to 1.5%)
In the explanations above, a is not equal to b, but c can be any decimal digit.

Check Equation
An equation which all the digits in a number, including the check digit, must satisfy.

We can eliminate (or easily detect) the problem of omitting or adding digits by restricting the input field to a given number of digits if we are dealing with numbers which are fixed in format, such as credit card numbers, Social Insurance Numbers, local phone numbers, and student ID numbers.

Other errors are detected by calculating whether the check equation for a particular check digit scheme is true. The check digit is included in the equation so that it is protected against errors as well. If the equation is not true, an error is present; if it is true, there may or may not be an error.

A number of different schemes for detecting decimal number errors have been suggested, and several are in common use. In the following, five schemes are outlined, along with summaries of their strengths and weaknesses and interactive demonstrations using forms. The last scheme, due to Verhoeff (source cited above), is the strongest method and, while certainly more complex than the other schemes, is not overly difficult to program (as illustrated by the JavaScript source embedded in this page).

ISBN mod 11 Check

The International Standard Book Number (ISBN) uses a weighted code: Each digit is weighted according to its position in the number and the check digit is chosen so the weighted sum is evenly divisible by a prime number. The check digit is the rightmost digit in a 10-digit number. The digit positions are numbered 1..10 from right to left. The weighted sum is divided by 11. Since the remainder resulting from division by 11 can be any number between 0 and 10, an 'X' is used to represent a check digit of 10 if necessary.

This scheme detects any single error and the transposition of any two digits at any distance (assuming the overall number is 10 or fewer digits long).

For example, given the number

```     0   -   1     3     1      5   -   2     4     4      7  -   X
```

the check equation is

```    10x0+9x1+8x3+7x1+6x5+5x2+4x4+3x4+2x7+1x10 mod 11 = 132 mod 11 = 0
```

IBM Check

The "IBM check", which is used by MasterCard, VISA, and most other credit card companies (including the new Hudson's Bay Company cards, but not the older ones), is an even/odd weighted code. The digits in the even positions (numbering from the right) are multiplied by 2, then reduced to a single digit (if > 9) by "casting out nines" (subtracting 9, which is equivalent to adding the digits). All digits are then summed and a check digit added to make the result evenly divisible by 10.

For example, given the number

```    6   1   8   2        0   9   2   3        1   5   5   3
```

the leading 6 is doubled, giving 12, which is then reduced to 3 by adding the digits of 12 together; similarly, the 8 becomes 16 and then 7; the 0 is impervious to doubling; the 2 becomes 4; the 1 becomes 2; and the 5 in the second-last position becomes 10 and thus 1. Thus the check equation is

```    6#2 + 1 + 8#2 + 2 + 0#2 + 9 + 2#2 + 3 + 1#2 + 5 + 5#2 + 3 mod 10 = 0
```

where '#' represents multiplication with casting out nines, giving

```    3 + 1 + 7 + 2 + 0 + 9 + 4+ 3 + 2 + 5 + 1 + 3 mod 10 = 40 mod 10 = 0
```

This scheme catches all single errors and most adjacent transpositions, but not jump transpositions (such as 553 becoming 355) or 09 becoming 90.

Electronic Funds Transfer Routing Number Check

The check digit scheme used on routing numbers for Electronic Funds Transfer (EFT) between banks uses a 9-digit number with position weightings of 3, 7, and 1. The check equation for a number a1a2a3a4a5a6a7a8a9 is

3a1 + 7a2 + a3 + 3a4 + 7a5 + a6 + 3a7 + 7 a8 + a9 mod 10 = 0

This scheme is based on the fact that multiplication modulo 10 yields a permutation of all 10 decimal digits if the multiplication factor is one of the digits 1, 3, 7, or 9, but only a subset of the decimal digits if the factor is 5 or an even digit, as illustrated in the following table:

Multiplication modulo 10
0 1 2 3 4 5 6 7 8 9
1 0 1 2 3 4 5 6 7 8 9
3 0 3 6 9 2 5 8 1 4 7
7 0 7 4 1 8 5 2 9 6 3
9 0 9 8 7 6 5 4 3 2 1

This scheme cannot detect adjacent transpositions of digits that differ by 5.

UPC Check

The Universal Product Code (UPC) is similar to the IBM check, but uses a weighting factor of 3 (instead of 2) for the digits in the even positions (counting from the right, including the check digit).

It shares the weakness of the previous scheme: overlooking adjacent transpositions of digits that differ by 5.

Verhoeff's Dihedral Group D5 Check

Verhoeff proposed a scheme which avoids the weakness of the preceding three schemes in failing to detect some adjacent transpositions due to using addition modulo 10. His solution is based on multiplication in the dihedral group D5, which is not commutative (i.e., a*b is not always equal to b*a). The following table shows the result of multiplying i by j in D5:

0   1   2   3   4   5   6   7   8   9

Verhoeff's check equation is of the form

f1(a1) * f2(a2) * . . . * fn(an) = 0

where multiplication is over D5 and f1, f2, . . . , fn are permutations of the ten decimal digits. Verhoeff found that the special case where fi is the ith iteration of a fixed permutation f yielded an excellent check:

F[0, j] = j, 0 <= j <= 9
F[1, j] = [1, 5, 7, 6, 2, 8, 3, 0, 9, 4], 0 <= j <= 9
F[i, j] = F[i - 1, F[1, j]], 2 <= i <= 7

Note that the function F cycles with period 8 (i.e., F[8, *] = F[0. *]).

The full table for function F is thus:

0   1   2   3   4   5   6   7   8   9

If the check digit is appended at position 0 in the number (i.e., as the rightmost digit if the positions are numbered from right to left, beginning with 0), then the check digit is calculated as the inverse (in D5) of the result of the successive multiplications

F1(a1) * F2(a2) * . . . * Fn(an)

where

Inv[i] = [0, 4, 3, 2, 1, 5, 6, 7, 8, 9], for 0 <= i <= 9.

Verhoeff's check equation catches all single errors, all adjacent transpositions, over 95% of twin errors, over 94% of jump transpositions and jump twin errors, and most phonetic errors.

The following form allows experimentation with Verhoeff's check digit scheme. To calculate a Verhoeff check digit, enter a decimal number in the first box below, then click the Compute button. The correct check digit will then be appended at the right end of the number entered.

To test that the scheme catches the sorts of errors claimed, try changing selected digits, then clicking the Check button. The result of the check will be displayed in the lower text box.

Note that the scheme does not catch most jump twin errors involving digits with a difference of 5, such as 050 vs. 505, 161 vs. 616, 272 vs. 727, and 494 vs. 949, but it does catch 383 vs. 838.