Posts Tagged ‘embedded’

Don’t Follow These 5 Dangerous Coding Standard Rules

Tuesday, August 30th, 2011 Michael Barr

Over the summer I happened across a brief blog post by another firmware developer in which he presented ten C coding rules for better embedded C code. I had an immediate strong negative reaction to half of his rules and later came to dislike a few more, so I’m going to describe what I don’t like about each. I’ll refer to this author as BadAdvice. I hope that if you have followed rules like the five below my comments will persuade you to move away from those toward a set of embedded C coding rules that keep bugs out. If you disagree, please start a constructive discussion in the comments.

Bad Rule #1: Do not divide; use right shift.

As worded, the above rule is way too broad. It’s not possible to always avoid C’s division operator. First of all, right shifting only works as a substitute for division when it is integer division and the denominator is a power of two (e.g., right shift by one bit to divide by 2, two bits to divide by 4, etc.). But I’ll give BadAdvice the benefit of the doubt and assume that he meant to say you should “use right shift as a substitute for division whenever possible”.

For his example, BadAdvice shows code to compute an average over 16 integer data samples, which are accumulated into a variable sum, during the first 16 iterations of a loop. On the 17th iteration, the average is computed by right shifting sum by 4 bits (i.e., dividing by 16). Perhaps the worst thing about this example code is how much it is tied a pair of #defines for the magic numbers 16 and 4. A simple but likely refactoring to average over 15 instead of 16 samples would break the entire example–you’d have to change from the right shift to a divide proper. It’s also easy to imagine someone changing AVG_COUNT from 16 to 15 without realizing about the shift; and if you didn’t change this, you’d get a bug in that the sum of 15 samples would still be right shifted by 4 bits.

Better Rule: Shift bits when you mean to shift bits and divide when you mean to divide.

There are many sources of bugs in software programs. The original programmer creates some bugs. Other bugs result from misunderstandings by those who later maintain, extend, port, and/or reuse the code. Thus coding rules should emphasize readability and portability most highly. The choice to deviate from a good coding rule in favor of efficiency should be taken only within a subset of the code. Unless there is a very specific function or construct that needs to be hand optimized, efficiency concerns should be left to the compiler.

Bad Rule #2: Use variable types in relation to the maximum value that variable may take.

BadAdvice gives the example of a variable named seconds, which holds integer values from 0 to 59. And he shows choosing char for the type over int. His stated goal is to reduce memory use.

In principle, I agree with the underlying practices of not always declaring variables int and choosing the type (and signedness) based on the maximum range of values. However, I think it essential that any practice like this be matched with a corresponding practice of always declaring specifically sized variables using C99’s portable fixed-width integer types.

It is impossible to understand the reasoning of the original programmer from unsigned char seconds;. Did he choose char because it is big enough or for some other reason? (Remember too that a plain char may be naturally signed or unsigned, depending on the compiler. Perhaps the original programmer even knows his compiler’s chars are default unsigned and omits that keyword.) The intent behind variables declared short and long is at least as difficult to decipher. A short integer may be 16-bits or 32-bits (or something else), depending on the compiler; a width the original programmer may have (or may not have) relied upon.

Better Rule: Whenever the width of an integer matters, use C99’s portable fixed-width integer types.

A variable declared uint16_t leaves no doubt about the original intent as it is very clearly meant to be a container for an unsigned integer value no wider than 16-bits. This type selection adds new and useful information to the source code and makes programs both more readable and more portable. Now that C99 has standardized the names of fixed-width integer types, declarations involving short and long should no longer be used. Even char should only be used for actual character (i.e., ASCII) data. (Of course, there may still be int variables around, where size does not matter, such as in loop counters.)

Bad Rule #3: Avoid >= and use <.

As worded above, I can’t say I understand this rule or its goal sufficiently, but to illustrate it BadAdvice gives the specific example of an if-else if wherein he recommends if (speed < 100) ... else if (speed > 99) instead of if (speed < 100) ... else if (speed >= 100). Say what? First of all, why not just use else for that specific scenario, as speed must be either below 100 or 100 or above.

Even if we assume we need to test for less than 100 first and then for greater than or equal to 100 second, why would anyone in their right mind prefer to use greater than 99? That would be confusing to any reader of the code. To me it reads like a bug and I need to keep going back over it to find the logical problem with the apparently mismatched range checks. Additionally, I believe that BadAdvice’s terse rationale that “Benefits: Lesser Code” is simply untrue. Any half decent compiler should be able to optimize either comparison as needed for the underlying processor.

Better Rule: Use whatever comparison operator is easiest to read in a given situation.

One of the very best things any embedded programmer can do is to make their code as readable as possible to as broad an audience as possible. That way another programmer who needs to modify your code, a peer doing code review to help you find bugs, or even you years later, will find the code hard to misinterpret.

Bad Rule #4: Avoid variable initialization while defining.

BadAdvice says that following the above rule will make initialization faster. He gives the example of unsigned char MyVariable = 100; (not preferred) vs:


#define INITIAL_VALUE 100
unsigned char MyVariable;
// Before entering forever loop in main
MyVariable = INITIAL_VALUE

Though it’s unclear from the above, let’s assume that MyVariable is a local stack variable. (It could also be global, the way his pseudo code is written.) I don’t think there should be a (portably) noticeable efficiency gain from switching to the latter. And I do think that following this rule creates an opening to forget to do the initialization or to unintentionally place the initialization code within a conditional clause.

Better Rule: Initialize every variable as soon as you know the initial value.

I’d much rather see every variable initialized on creation with perhaps the creation of the variable postponed as long as possible. If you’re using a C99 or C++ compiler, you can declare a variable anywhere within the body of a function.

Bad Rule #5: Use #defines for constant numbers.

The example given for this rule is of defining three constant values, including #define ON 1 and #define OFF 0. The rationale is “Increased convenience of changing values in a single place for the whole file. Provides structure to the code.” And I agree that using named constants instead of magic numbers elsewhere in the code is a valuable practice. However, I think there is an even better way to go about this.

Better Rule: Declare constants using const or enum.

C’s const keyword can be used to declare a variable of any type as unable to be changed at run-time. This is a preferable way of declaring constants, as they are in this way given a type that can be used to make comparisons properly and enabling them to be type-checked by the compiler if they are passed as parameters to function calls. Enumeration sets may be used instead for integer constants that come in groups, such as enum { OFF = 0, ON };.

Final Thoughts

There are two scary things about these and a few of the other rules on BadAdvice’s blog. First, is that they are out there on the Internet to be found with a search for embedded C coding rules. Second, is that BadAdvice’s bio says he works on medical device design. I’m not sure which is worse. But I do hope the above reasoning and proposed better rules gets you thinking about how to develop more reliable embedded software with fewer bugs.

How to Enforce Coding Standards Using PC-Lint

Thursday, June 16th, 2011 Michael Barr

In an earlier blog post, I introduced the concept of automatic enforcement of coding standards by stating that:

Enforcement of coding standards too often depends on programmers already under deadline pressure to be disciplined while they code and/or to make time to perform peer code reviews. … To ensure your selected coding standard is followed, and thus effective, your team should find as many automated ways to enforce as many of its rules as possible. And you should make such automated rule checking part of the everyday software build process.

One of the tools we have found to be indispensible for this purpose is Gimpel Software‘s PC-Lint (and the multi-platform FlexeLint). At just a few hundred dollars per seat, PC-Lint happens to also be one of the least expensive tools we own and thus a source of tremendous value to our team.

The following options can be set (in version 9.00) to assist in the automated enforcement of the identified rules from Barr Group’s Embedded C Coding Standard.

 

Rule(s) Tool Configuration Notes
Which C (1.1) 975 (unrecognized pragma)

+pragma(name, action)

-pragma(name)

Braces (1.3) PC-Lint can help identify missing braces through indentation checking:

525 (negative indentation)

539 (did not expect positive indentation)

725 (expected positive indentation)

Parentheses (1.4) 665 (Unparenthesized parameter in macro is passed an expression)

773 (Expression-like macro not parenthesized)

821 (Right hand side of assignment not parenthesized)

834 (Operator ‘Name’ followed by operator ‘Name’ is confusing. Use parentheses)

973 (Unary operator in macro ‘Symbol‘ not parenthesized)

Casts (1.6) 507, 511, 519 (Size incompatibility in cast)

549, 571, 611 (suspicious cast)

643 (loss of precision in pointer cast)

680 (suspicious truncation in arithmetic expression converted to pointer)

688 (cast used within preprocessor conditional statement)

740, 741 (unusual pointer cast)

920 (cast from ‘type’ to ‘void’)

921-930 (cast from ‘type’ to ‘type’)

1773 (attempt to cast away const or volatile)

Keywords to avoid (1.7) -deprecate( keyword, auto, violates coding standard )

-deprecate( keyword, register, violates coding standard )

-deprecate( keyword, goto, violates coding standard )

-deprecate( keyword, continue, violates coding standard )

Keywords to Frequent (1.8) 818 (Pointer parameter could be declared ptr to const)

843 (Variable could be declared as const)

844 (Pointer variable could be declared as const)

952 (parameter could be declared as const)

953 (variable could be declared as const)

954 (pointer variable could be declared as pointing to a const)

765 (external symbol could be made static)

Comments, Acceptable Formats (2.1) 602 (comment within comment)
Alignment (3.2)

Indentation (3.4)

525 (negative indentation)

539 (did not expect positive indentation)

725 (expected positive indentation)

Header Files (4.2)

Source Files (4.3)

451 (header file repeatedly included but does not have a standard include guard)

766 (Include of header file not used in module)

966 (indirectly included header file not used in module)

967 (Header file does not have a standard include guard)

Fixed Width Integers (5.2)

Signed Integers (5.3)

569 (Loss of information — Integer bits to Integer bits)

570 (loss of sign, assignment being made from a negative constant to an unsigned quantity)

573 (signed-unsigned mix with divide)

574 (signed-unsigned mix with relational)

701, 703 (shift left of signed quantity)

702, 704 (shift right of signed quantity)

712 (loss of precision, one type is larger than another type)

713 (loss of precision, signed to unsigned)

734 (loss of precision, to a quantity smaller than an int)

Structure and Unions (5.5) 38 (Offset of symbol inconsistent — A member of a class or struct appears in a different position (offset from the start of the structure) than an earlier declaration)

39 (Redefinition of symbol conflicts with ‘location’ — a struct or a union is being redefined)

Function-Like Macros (6.3) 665 (Unparenthesized parameter in macro is passed an expression)

773 (Expression-like macro not parenthesized)

Initialization (7.2) 35 (initializer has side-effects)

133 (too many initializers for aggregate)

134 (missing initializer)

530, 603 (symbol not initialized)

540 (excessive size — a string initializer required more space than what was allocated)

644 (variable may not have been initialized)

651 (potentially confusing initializer)

708 (union initialization)

727, 728, 729, 738 (symbol not explicitly initialized)

771, 772 (symbol conceivably not initialized)

784 (nul character truncated from string — During initialization of an array with a string constant there was not enough room to hold the trailing NUL character)

785, 943 (too few initializers for aggregate)

940 (omitted braces within an initializer)

Switch Statements (8.3) 44 (A case or default statement occurred outside a switch)

108 (Invalid context — A continue or break statement was encountered without an appropriate surrounding context)

137 (constant used twice within switch)

408 (type mismatch with switch expression)

744 (switch statement has no default)

764 (switch statement does not have a case)

825 (control flows into case/default)

Equivalence Tests (8.6) 720 (Boolean test of assignment)

 

As a convenience to readers who follow Barr Group’s Embedded C Coding Standard and may like to use RSM and/or PC-Lint to inexpensively and automatically enforce as many of the rules as possible, here are starter configuration files:

Is “(uint16_t) -1” Portable C Code?

Thursday, June 2nd, 2011 Michael Barr

Twice recently, I’ve run across third-party middleware that included a statement of the form:

uint16_t variable = (uint16_t) -1;

which I take as the author’s clever way of coding:

0xFFFF

I’m not naturally inclined to like the obfuscation, but wondered if “(uint16_t) -1” is even portable C code? And, supposing it is portable, is there some advantage I don’t know about that suggests using that form over the hex literal? In the process of researching these issues, I learned a helpful fact or two worth sharing.

Q: Is the result of “(uint16_t) -1” guaranteed (by the ISO C standard) to be 0xFFFF?

A: No. But it’s likely the result will be 0xFFFF on most compilers/processors, since there is really just the one common internal CPU representation of unsigned integers. (For signed integers, most/all processors will use the common 2’s complement representation underneath–even though that’s not required in any way by the language standard.)

Q: Is there any advantage to writing 0xFFFF that way?

A: According to the C99 Standard, all conforming implementations support uint_least16_t, but some may not support uint16_t. If the platform doesn’t support uint16_t, then “(uint16_t) -1” won’t compile, but 0xFFFF will compile as a value of some larger unsigned integer type (i.e., a bug waiting to happen).

Of course, platforms that don’t have a fixed-width 16-bit unsigned capability are rare, though it may be that some DSPs fall into that category. The same issue applies to uint32_t and 0xFFFFFFFF, of course. However, I suspect platforms that don’t have a fixed-width 32-bit unsigned capability are even rarer.

Q: What is the best way to represent the maximum unsigned integer value of a given size?

A: The very best way to represent the maximum values for unsigned (and signed) fixed-width types is to use the constants named in C99’s stdint.h header file. These are of the form UINTn_MAX (and INTn_MAX) where n is the number of bits (e.g., UINT16_MAX). That is guaranteed to either work or not compile, with no middle ground for bugs.

Hat Tip: Many thanks to C and C++ standards guru Dan Saks for help with these answers.

How to Enforce Coding Standards (Automatically)

Wednesday, May 25th, 2011 Michael Barr

Coding standards can be an important tool in the fight to keep bugs out of embedded software. Unfortunately, too many well-intentioned (especially, corporate) coding standards are ineffective and gather more dust than followers. The hard truth is that enforcement of coding standards too often depends on programmers already under deadline pressure to be disciplined while they code and/or to make time to perform peer code reviews. And when peer reviews are done in this scenario, they too easily devolve into compliance discussions that miss the forest for the trees.

To ensure your selected coding standard is followed and thus effective, your team should find as many automated ways to enforce as many of its rules as possible. And you should make such automated rule checking part of the everyday software build process. Ideally, you would also restrict version control check-ins to just code that has passed all the automated checks. No code that breaks any of these automatable rules should be allowed in peer code reviews. That way, code reviewers can focus their limited hours on (1) what the code is supposed to do, (2) whether it does so correctly, and (3) whether the code and comments together are easily understood by all involved.

One of the ways Barr Group has found to increase compliance with its Embedded C Coding Standard is by configuring static analysis tools we already use to automatically enforce individual rules.

Perhaps the best option is to use a static analysis tool that includes built-in support for your chosen coding standard and/or the ability to be customized to proprietary rules. LDRA‘s static analysis engine, a screenshot from which is shown in the image below, comes preconfigured to check about 80% of the “Barr Group” coding standard rules, as well as a number of other widely used coding standards.

But there are other, less expensive, options available as well. Two tools that we have used are RSM and PC-Lint from M Squared Technologies and Gimpel Software, respectively. Neither can easily and independently enforce all of our Embedded C Coding Standard rules. But used together and properly configured, these two tools offer a low-cost automated coding standards enforcement mechanism that covers a large percentage of the rules.

Configuring RSM

The following RSM outputs can be examined and configuration options set (in version 7.75) to assist in the automated enforcement of the identified rules from the Embedded C Coding Standard. Pricing for the RSM tool, which runs on Windows, Linux, and other versions of Unix (including Mac OS X), is online at http://msquaredtechnologies.com/m2rsm/order/.

Rule # Brief Description Tool Configuration Notes
1.2.a Line length limited to 80 characters. Quality Notice 1
1.3.a Braces surround all blocks of code. Quality Notice 22
1.7.c Keyword goto not used. Quality Notice 9
1.7.d Keyword continue not used. Quality Notice 43
1.7.e Keyword break not used outside switch. Quality Notice 44
2.2 Location and content of comments. Quality Notices 17, 20, 51; Options -Es -Ec -EC
3.1 Use of white space. Quality Notices 16, 19
3.5.a No tab characters. Quality Notice 30; Option -Dt
3.6.a UNIX-style (single-character) linefeeds. Option -Du
4.1.a-d Module naming conventions. Option -Rn
4.2.a Precisely one header file per source file. Option -Rn
6.1.a-e Precisely one header file per source file. Quality Notice 2; Option -l
6.2 Cyclomatic complexity of functions. Quality Notices 10, 18, 27; Option -c
6.3.b.i-iii Preprocessor macro safety mechanisms. Option -m
8.2.d If-else if statements always have else. Quality Notice 22
8.3.b Switch statements always have default. Quality Notices 13, 14, 56
8.5.a No unconditional jumps. Quality Notices 9, 43, 444

Configuring PC-Lint

In a future blog post, I will similarly identify PC-Lint configurations that can be used (in version 9.0) to assist in the automated enforcement of specific rules from the Embedded C Coding Standard. Pricing for the PC-Lint tool, which runs on Windows, Linux, and other versions of Unix (including Mac OS X), is online at http://gimpel.com/html/order.htm.

Embedded Software Training in a Box

Friday, May 6th, 2011 Michael Barr

Embedded Software Training in a BoxI am beaming with pride. I think we have finally achieved the holy grail of firmware training: Embedded Software Training in a Box. Priced at just $599, the kit includes Everything-You-Need-to-Know-to-Develop-Quality-Reliable-Firmware-in-C, including software for real-time safety-critical systems such as medical devices.

In many ways, this product is the culmination of about the last fifteen years of my career. The knowledge and skills imparted in the kit are drawn from my varied experiences as:

This kit also–at long last–answers the question I’ve been receiving from around the world since I first started writing articles and books about embedded programming: “Where/How can I learn to be a great embedded programmer?” I believe the answer is now as easy as: “Embedded Software Boot Camp in a Box!”