embedded software boot camp

Hamming distances

Monday, June 5th, 2017 by Nigel Jones

My guess is that most readers of this blog have at least a vague idea of what “Hamming distance” is.  At the most abstract level it is a measure of how different two equal length “strings” are. In the case where the “strings” are actually binary integers, then the the Hamming distance is simply the number of bit positions that differ. Thus if we are comparing 8-bit integers, then the values 0x00 and 0x01 have a Hamming distance of 1, whereas 0xFF and 0x00 have a Hamming distance of 8.

So what you ask? Well over the years I’ve noticed a recurring “problem” in the industry. The scenario is the following. A design calls for some sort of distributed processing architecture, often with a master processor and a number of slaves. These slaves communicate with the master over some sort of serial bus, and each slave has a unique address. In just about every application I have ever seen, the slave addresses are assigned sequential addresses starting at 1 (0 is usually reserved for the master or for broadcasts). So what’s wrong with this you ask? Well, if you do this you are increasing the probability that the wrong processor will be addressed. For example if two processors have addresses of 0x02 and 0x03, then a single bit flip in the LSB will cause the wrong processor to be addressed. Conversely if the two addresses are e.g. 0x55 and 0xAA then you’ll need 8 bit flips to cause the wrong processor to be addressed. In short, you can minimize the probability of incorrect addressing by maximizing the Hamming distance of the addresses.

Now before you rush to the comment section to tell me that you always protect your address bytes with a block check character or a CRC, go and read this article I wrote a few years back (or more to the point read the referenced article by Phil Koopman). Unless you have chosen the optimal CRC, the chances are you aren’t getting the level of protection you think you are. Even if you are using a good CRC, employing the concept of maximizing the Hamming distance essentially comes for free and so why not use it?

I’ll assume for the sake of argument that I’ve convinced you that this is a good idea. Now if you have just two processors on the bus, then it’s easy to choose addresses that give a Hamming distance of 8. For example if the first processor has an address of A, then just assign the second processor an address of A XOR 0xFF. E.g. 0x55 and 0xAA.

However, what happens when you have three processors on the bus? In this case you want to maximize the Hamming distance between all three. The questions now become:

  1. What is the maximum Hamming distance obtainable when choosing three unique values out of an 8 bit byte?
  2. What are example values I can use?

Now I’m sure that some mathematician (Hamming?) has worked out a generalized solution to this problem. In my case I wrote a quick program to explore the solution space and discovered that the maximum Hamming distance achievable between three 8-bit values is 5. An example is {0xFF, 0xE0, 0x1C}. To see that this is the case: 0xFF XOR 0xE0 = 0x1F (1.e. five ones); 0xFF XOR 0x1C = 0xE3 (i.e. five ones); 0xE0 XOR 0x1C = 0xFC (i.e. 6 ones).  Note that the fact that the last pair yields a Hamming distance of 6 is nice, but does not alter the fact that the minimum of the maximum Hamming distances is 5.  FWIW my program tells me that there are something like 143,360 different combinations that will yield a Hamming distance of 5 for this particular set of requirements.

What about when you have four devices on the bus? In this case the minimum of the maximum achievable Hamming distance drops to 4. An example is {0xFF, 0xF0, 0xCC, 0x0F}.  For five devices the minimum of the maximum Hamming distance is also 4. An example is {0xFF, 0xF0, 0xCC, 0xC3, 0xAA}

So to summarize. With an 8 bit address field, the maximum Hamming distance achievable by device count is:

  • Two devices. Maximum Hamming distance = 8.
  • Three devices. Maximum Hamming distance = 5.
  • Four devices. Maximum Hamming distance = 4.
  • Five devices. Maximum Hamming distance = 4.

Beyond 5 devices I’ll need a better algorithm than what I’m using to find a solution. However given that in most of my work I rarely go beyond 4 devices I’ve never felt the need to put the effort in to find out. If anyone out there has a closed form solution or an efficient algorithm for determining any arbitrary solution (e.g. I have 13 devices sitting on a CANBus which uses 11-bit identifiers.  What addresses should I assign to maximize the Hamming distance), then I’d be very interested in hearing about it.

Note that I have concentrated here on address bytes. The same principle applies to any other parameter over which you have control. Typical examples are command bytes and also enumerated values. In short, if a typical message that you send looks something like this: <STX> 01 00 00 02 00 [BCC] <ETX>, then try applying the Hamming distance principles such that a typical message looks more like <STX> FC 73 00 42 D9 [BCC] <ETX>. Not only will you get more protection, you’ll also find that the messages are easier to interpret on a protocol analyzer since you aren’t dealing with a sea of zeros and ones.

Finally, as in most things in engineering, there is another way of looking at the problem. See this article I wrote a few years back.

10 Responses to “Hamming distances”

  1. Bruno says:

    I haven’t heard this definition before, but I am very familiar with the idea.

    But Why maximize? Instead of using optimized hamming distance within a short bitfield I think a better option is to increase the message size and enforce the use a good hamming distance, unless of course you really have speed requirements (which usually is not relevant for control operations)

    By the way, nice to see activity in the blog.

    • Nigel Jones says:

      I suppose the answer is that given you have a constraint (e.g. an 8-bit address) then what’s a free (i.e. zero cost) methodology for increasing your reliability? Maximizing your Hamming distance is an answer. Although I didn’t mention it in the article, you can of course get quite a lot of bang for your buck by using parity bits.

  2. Ben says:

    Why not start with a dense range of numbers, then apply a standard error correcting code, like Reed Solomon? Wouldn’t you end up with more or less the same thing?

    • Nigel Jones says:

      You would end up with a similar type result. However ECC such as RS carries quite a bit more processing overhead. If you need an ECC code anyway, then I’d be inclined to drop the Hamming technique. However if you’re adding ECC in order to avoid increasing the Hamming distance then I’d say you are making a poor choice.

  3. versat says:

    0xAA right shifted is 0x55, so while there is a great Hamming Distance, it could be misinterpreted because of a delay or so.
    I saw this with Logic Analyzers and where the signal is not perfect (for the Analyzer).
    This might be a special case and if one has such problems the communication should be improved anyway.
    But when one sees this for the first time it might be a bit confusing until you realize the real problem.
    IMHO if you use 0xA5 and 0x5A, the Hamming Distance is still the same and one could better see if there is a shift.
    Is there a name for the idea to find numbers in a range with maximum Hamming Distance, that are not easily misinterpreted by inversion, shifting, …?

    • Nigel Jones says:

      You are quite right. Hamming codes maximize ones protection against random events and not systematic events such as a constant phase error (i.e. a shift). I have no idea what one would call a set of numbers that has a large Hamming distance whilst simultaneously having other desirable noise properties such as low correlation with other members of the set. Perhaps a good approach might be to start with a sequence of numbers that has minimal correlation and then select numbers from the sequence that have a maximum Hamming distance. In the complex domain, Zadoff-Chu sequences produce numbers that have excellent correlation properties. Anyone know how to generate a series of real integers that have excellent correlation properties?

  4. Ashleigh Quick says:

    If you use a maximal Hamming distance combined with parity, it’s quite easy to implement a crude error correction coding system without the overhead or complexity of Reed-Solomona and so on.

    For example, you might code up only a few values. An example I’ve used before springs to mind – 4 values, coded into say 4 bits.

    In the case where there is a parity error AND you then go on to examine the bits of the encoded value, you can then determine a correction based on calculating the actual Hamming distance (which is a simple XOR and count the bits operation). This does not give you a guarantee of correction, but it is highly likely to correct for single bit errors (bit flips).

    Most of the time this can be a silly thing to do, but for example in the header fields of communication protocols it can be a very good thing, as often header fields appear before any higher level checksum or CRC has kicked into the protocol definition (Zigbee I’m looking at you).

    Such cases can then allow a better error recover process. Depends on what you are looking for. But in general where you don’t have any other form of protection on your transmission, use of Hamming coding (and sensible receive processing) will save you a lot of pain later.

  5. Graham Bartlett says:

    I was just Googling for examples of maximal Hamming distances, for exactly the application you suggest, and I found this page.

    I do have to pick you up on the values here, though. Whilst these are optimal for detecting a few bits flipped, we have other “popular” forms of data corruption too, most notably all bits stuck-at-0 or stuck-at-1 which are what you’re most likely to see with badly-synchronised data or a faulty comms link. So whilst 0xFF might be a valid choice for IDs based on Hamming distance between other IDs, other factors make it a pretty bad choice. Any competent coder should be able to recreate your test program, of course, so they can work out a family of distance-N IDs which don’t include 0x00 or 0xFF.

Leave a Reply

You must be logged in to post a comment.