I’ve been programming microcontrollers for about 25 years now – and can count on one hand the number of times I’ve needed to compute the square root of an integer. This curious drought came to an end recently when I needed to compute the Crest Factor of the line voltage being used to power a product I was designing. (For the uninitiated / rusty out there, Crest Factor is the ratio of the Peak : RMS of a waveform. For example, A sine wave has a CF of 1.414, whereas a square wave has a CF of 1.000).

Why, you might ask, do I need to compute the CF? Well, the product uses triacs to control a number of AC loads. If the system is inadvertently powered from a square wave inverter, or just a really lousy generator, then the triacs will not self-commutate – and I could never turn off the loads. Thus to prevent this unfortunate scenario, I need to know how good (i.e. sinusoidal) the line voltage is. The CF is a direct figure of merit that allows me to make this decision.

Evidently, the computation of CF requires one to compute an RMS voltage, which in turn requires one to calculate the square root of a number. For various reasons, I need to compute the CF on a mains cycle by cycle basis – and I’m using a 7.37 MHz ATmega CPU. Thus, the computational efficiency of the algorithm is important.

Now IAR has a nifty little algorithm that computes an approximate square root. See http://supp.iar.com/Support/?note=18180&from=search+result

However, this gets blown away by the algorithm described by Crenshaw in his wonderful book: Math Toolkit for Real-Time Programming, CMP Books. ISBN 1-929629-09-5.

The code in his book is for computing the square root of a 32 bit unsigned integer. I adapted it to give the square root of a 16 bit integer. Here’s the code:

static inline uint8_t friden_sqrt16(uint16_t val) { uint16_t rem = 0; uint16_t root = 0; uint8_t i; for(i = 0; i < 8; i++) { root <<= 1; rem = ((rem << 2) + (val >> 14)); val <<= 2; root++; if (root <= rem) { rem -=root; root++; } else { root--; } } return (uint8_t)(root >> 1); }

This will compute the exact square root of a 16 bit integer in about 268 clock cycles on an AVR – i.e. in about 33 microseconds on an 8 MHz AVR processor.

To Crenshaw’s point – don’t just blindly use the code, but endeavor to understand how it works. Only then will you see it for what it truly is – a work of art. Thanks Jack.

Nigel,I stumbled across your blog recently and would like to compliment you on your “embedded” outlook. Your posts are always focussed and refreshing.Please keep them coming.-R.

What is the operator? :

rem = ((rem 14));

Sigh. The post killed the operator in question. It was the ”. In Basic, that’s equivalent to !=; but what is that operator in C?

OK. I hate post modification. I hope you can figure which operator by context.

Stupid post editors.

Ian:

You weren’t going mad. This post was originally written using Google’s blogger. A few months ago we switched to WordPress. However the automated blog converter clearly has a few holes in it as it completely mangled the offending line. Anyway I have updated the post so that hopefully it makes a bit more sense now.

Ah OK, I see the modification in the post (which now makes more sense) AND I now see how WordPress removes all text between a ‘less-than’ & ‘greater-than’ sign. What kind of blog control is that?!

Now that we’ve got that handled, I originally looked at this post because I was forwarded the following square root PDF : http://www.worldserver.com/turk/computergraphics/FixedSqrt.pdf .

Although I am good at math, what that really means is that I’m now old enough & smart enough (& lazy enough) to realize I can’t readily compare these two algorithms. They perform somewhat similar operations but that does not guarantee that they are similar in concept.

Anyone care to compare/contrast these two approaches?

Thanks for the integer square root. It looks very good. Have you tried unrolling the loop for a tiny speedup?

I wonder whether you could avoid calculating the square root entirely in your problem. Instead of testing

CF = peak / RMS;

if (CF < threshold) shut_down();

Couldn't you calculate this?

CF² = peak² / RMS²;

if CF² < threshold²) shut_down();

/* Those characters that WordPress will inevitably mangle are superscript 2's, e.g., CF squared. */

That replaces one square root with one multiply. (You can precompute threshold².) It may also require dividing bigger integers, depending on the range of your data.

Good questions Bob. I haven’t tried to manually unroll the loop. With 8 iterations there is a reasonable chance that the compiler did the unroll for me with full speed optimization turned on; however I haven’t checked and the computation was fast enough for me as is. You raise an excellent point about avoiding the square root and comparing CF^2 to some limit. This is the sort of technique I like to use a lot. However in this case, it turned out that I actually needed the real CF for another part of the code. Notwithstanding that, I didn’t need the real CF for another part of the code on a cycle by cycle basis and so it is a nice potential optimization in that I could compute the true CF only when I needed it.