Once in a while something happens that makes me realize that techniques that I routinely use are simply not widely known in the embedded world. I had such an epiphany recently concerning continued fractions. If you don’t know what these are, then check out this link.

As entertaining as the link is, let me cut to the chase as to why you need to know this technique. In a nutshell, in the embedded world we often need to perform fixed point arithmetic for cost / performance reasons. Although this is not a problem in many cases, what happens when you need to multiply something by, say, 1.2764? The naive way to do this might be:

uint16_t scale(uint8_t x)
{
    uint16_t y;

    y = (x * 12764) / 10000;

    return y;
}

As written, this will fail on most small embedded targets because the expression (x * 12764) is evaluated in int arithmetic, and on 8- and 16-bit architectures int is only 16 bits wide – so the intermediate product overflows. Thus it’s necessary to throw in some very expensive casts. E.g.

uint16_t scale(uint8_t x)
{
    uint16_t y;

    y = ((uint32_t)x * 12764) / 10000;

    return y;
}

Our speedy integer arithmetic isn’t looking so good now is it?

What we really want to do is to use a fraction (a/b) that is a close approximation to 1.2764 – but (in this case) has a numerator that doesn’t exceed 255, so that the worst-case product 255 * 255 = 65025 still fits in 16 bit arithmetic.

Enter continued fractions. One of the many uses for this technique is finding fractions (a/b) that are approximations to real numbers. In this case using the calculator here, we get the following results:

Convergents:

1: 1/1 = 1

3: 4/3 = 1.3333333333333333

1: 5/4 = 1.25

1: 9/7 = 1.2857142857142858

1: 14/11 = 1.2727272727272727

1: 23/18 = 1.2777777777777777

1: 37/29 = 1.2758620689655173

1: 60/47 = 1.2765957446808511

1: 97/76 = 1.2763157894736843

1: 157/123 = 1.2764227642276422

2: 411/322 = 1.2763975155279503

1: 1801/1411 = 1.2763997165131113

1: 3191/2500 = 1.2764

We get higher accuracy as we go down the list. In this case, I chose the approximation (157 / 123) because it’s the most accurate convergent whose numerator doesn’t exceed 255. Thus my code now becomes:

uint16_t scale(uint8_t x)
{
    uint16_t y;

    y = ((uint16_t)x * 157) / 123;

    return y;
}

The error is less than 0.002% – but the calculation speed is dramatically improved because I don’t need to resort to 32 bit arithmetic. [On an ATmega88 processor, calling scale() for every value from 0-255 took 148,677 cycles for the naive approach and 53,300 cycles for the continued fraction approach.]

Incidentally, you might be wondering if there are other fractions that give better results than the ones generated by this technique. The mathematicians tell us no: each convergent is the best rational approximation among all fractions with a denominator no larger than its own.

So there you have it. A nifty technique that once you know about it will make you wonder how you got along without it for all these years.

My question is: Why would you even do a divide when a multiply-shift-round will do just as well and you have a HW multiplier available? I agree that the topic of continued fractions is interesting, but as it’s only useful when dividing by a constant, I don’t think it is very relevant to the embedded world. Just multiply by the reciprocal, do some shifting, round off for thoroughness, and you’re done – except in the rare case that you don’t have a HW multiplier.

Thanks for the comment Bryce. I guess I don’t see how what you advocate solves the problem. Could you elaborate?

If you’re dividing only by a constant, as in your example, it would be far faster to multiply by a constant, even if you have to wrap the higher-order multiply around an 8-bit HW multiplier. The essence is this: re-factor your multiply-divide into one that has a power-of-two divisor, which can be performed by simple shifting operations. Using your example:

First, multiply your constant by 2^n to fill the bits of your multiplier operand: 1.2764 * 2^15 = 41825 = 0xA361. The resulting representation, 41825 / 32768 = 1.276397705078125, has an error of 2.3*10^-6, an order of magnitude less than your solution’s error of 2.3*10^-5. (Not that this matters much if you just want the integer part of the answer. If that’s all you want, an 8-bit multiplier might give you the correct results for all 8-bit inputs, but I’ll leave that analysis up to you.)

Then do the multiply. In your case, it seems that you’re wanting the 16-bit integer result, but you could also get more precision out of the operation if you wanted. Let’s say an input of 123: 123 x 41825 = 5144475 as an intermediate 24-bit result. The integer part of your answer is in the top 9 bits.

Last, just shift out (and round) the bits of precision that you don’t want or need. If you don’t have a barrel shifter, don’t forget to shift in the shortest direction.

Even with the extra logic to extend the precision of an 8-bit hardware multiplier, and without a barrel shifter, I can almost guarantee that this is gonna be a *lot* faster than anything involving an integer divide in software. It’s a cardinal rule of mine to never, ever, ever divide by a constant when you can multiply by a constant instead.

I second that. I have hunted down divisions more than once to try to reduce CPU load, on an older platform without a HW divider.

Now I understand. You are of course quite right and I appreciate the insight. Evidently I didn’t choose a great example for the usefulness of continued fractions!

I can see that they’re interesting, but since they require tuning for each constant value, I have a hard time imagining their usefulness unless you have hardware on which a divide is really as cheap as a multiply.

Looks like the link for continued fractions has moved. Here is the new location:

http://www.maths.surrey.ac.uk/hosted-sites/R.Knott/Fibonacci/cfINTRO.html

Thanks Joel. I’ll update the article.