embedded software boot camp

Setting a bad example – part 5

Wednesday, August 11th, 2010 by Nigel Jones

This is the fifth part in a series of  articles in which I’m ‘highlighting’ what I consider to be lousy code provided by IAR in their NXP LPC1764 development kit. This code excerpt is taken from usb_t9.c in the AudioDevice project.

/*************************************************************************
 * Function Name: UsbCoreReq
 * Parameters:  UsbCoreReqType_t Type
 *
 * Return: Int32U
 *
 * Description: Return device states
 *
 *************************************************************************/
Int32U UsbCoreReq (UsbCoreReqType_t Type)
{
 switch(Type)
 {
 case UsbCoreReqDevState:
   return(UsbDevCtrl.State.DS);
 case UsbCoreReqDevSusState:
   return(UsbDevCtrl.State.Suspend);
 case UsbCoreReqConfiquration:
   return(UsbDevCtrl.Configuration);
 case UsbCoreReqInterface:
   return(UsbDevCtrl.Interfaces);
 case UsbCoreReqDevOwnAddress:
   return(UsbDevCtrl.DevAdd);
 case UsbCoreReqWakeUpEnableStatus:
   return(UsbDevCtrl.Feature.RemoteWakeupEnable);
 case UsbCoreReqSelfPoweredStatus:
   return(USB_SELF_POWERED);
#if USB_HIGH_SPEED > 0
 case UsbCoreReqSpeed:
   return(UsbDevCtrl.HighSpeed);
#endif // USB_HIGH_SPEED > 0
 }
 return((Int32U) -1);
}

I suspect that on first blush many readers will not be too offended by this function. However it contains several things that are pet peeves of mine. I’ve listed them in the order that they occur in the function

  1. As with most of the code, I found the function header to be almost useless. Having read it multiple times and also having looked at the code, I’m still not really clear what the purpose of this function is other than to provide access to some of the parameters of the structure UsbDevCtrl.
  2. This function does nothing more than switch on a parameter and return a value. As I outlined in this post, IMHO one should use multiple simple functions rather than one large function such as this. The resulting code is smaller, faster and the purpose of each function is much clearer. It also avoids the problem that is discussed in point 7 below.
  3. The author has used the variable name ‘Type’. While I know this is legal and it may even be somewhat meaningful in this case, I find the idea of using ‘Type’ as a variable name just plain confusing.
  4. The function contains no-less than nine return statements. Most coding standards require that a function have a single exit point. I have to admit that if you’d asked me ten years ago whether I thought this was a good idea I’d have probably equivocated. However in recent years I have really made an effort to ensure that all my functions have a single exit point – and I have found it has improved my code a lot. Of course there are the odd cases where it really is unavoidable. However in this case, maintaining a single exit point would have been trivial.
  5. The argument to the return statement is in parentheses. This seems to be a popular construct, which I simply do not understand. What benefit do the parentheses confer? To those that ask what harm are they doing, then I would suggest that they make the code harder to read.
  6. The switch statement has no default clause. Yes, one will fall through to the last return statement – but it’s a sloppy way of handling it.
  7. The pièce de résistance is of course the cast of -1 to an unsigned integer. The function is declared as returning an unsigned integer – but in one case it returns a signed integer that is cast to unsigned. I’m sure that you will not be surprised to hear that none of the functions that call UsbCoreReq() check to see if “-1” has been returned.

I don’t think this example is quite as horrific as some of the others I have highlighted – but it’s still pretty bad. As always your comments make this interesting!

33 Responses to “Setting a bad example – part 5”

  1. John Regehr says:

    Hi Nigel– Make sure you start picking on vendors other than IAR soon. Goodness known I’ve seen real gems in just about everyone’s sample code :).

  2. […] Setting a bad example – part 5 « Stack Overflow.   Leave a […]

  3. Gauthier says:

    I suppose that the case values are constants? Maybe defines? They are not all caps, is this legitimate?

    About the number of return points, the argument talking for multiple returns – that I’ve seen most often – is for input parameter check.

    Error_code_type my_function(int32u input_parameter)
    {
    if (input_parameter > INPUT_MAX) {
    return ERROR_CODE;
    }

    // continue with with your function
    }

    Strangely enough I started to switch in the other direction than you did (from only one return point to several).
    But of course, wild returns in the middle of an algorithm flow is a bad bad thing.
    In this example you avoid the need for more indentation levels.

    Error_code_type my_function(int32u input_parameter)
    {
    if (input_parameter <= INPUT_MAX) {
    // continue with your function. The code is indented at one more level than before.
    // And if you need to have more complex checks, you might to indent even more.
    } else {
    return ERROR_CODE;
    }
    }

    In the example that you mentioned, I wonder if the author wanted to spare a memory slot by not having a temporary variable. This is something an optimiser should take care of.
    I know that the last "C" compiler I used would not, it would actually use a static RAM address for every variable in a function. I guess it's not the case with IAR tools.

    Another thing: what if the members of UsbDevCtrl start changing types? Say if UsbDevCtrl.Configuration becomes a signed word?

    • Gauthier says:

      (you need a way to write code in the comments, it seems)

    • Nigel Jones says:

      I’m not sure what motivated the author to write this particular function this way. The parameter checking argument I have heard before – and it’s a reasonable one. I tend to tackle it by writing code that looks something like this
      uint32_t return_code = 0;
      if (par1 > LIMIT1)
      {
      return_code |= (1 << 0);
      }
      if (par2 > LIMIT2)
      {
      return_code |= (1 << 1)
      }

      if (par32 > LIMIT32)
      {
      return_code |= (1 << 31);
      }

      if (!return_code)
      {
      /* Proceed with the work */
      }

      return return_code;
      Not only does this solve the problem of a single return point, it also has the advantage of providing a complete report on what was wrong with the input parameters, rather than reporting on just the first problem.

      I also like your last point about the type changing. That would really screw up this function!

      • Bernhard Weller says:

        Thats a nice way of handling it, I used to write the code like Gauthier, because the indentation sometimes really screwed the readability.

        So I guess I’ll adapt this way of doing the argument checks. But I’d check if the return_code is zero, I know that C supports checking for true or false on integers, but (imho) you shouldn’t check if an integer is true. There is a boolean for that (or if there isn’t I’d typedef one, just to make clear if this is a value which holds true or false and not a number)

        Then again, the uint32 is not really a number in this case anyway.

        I just checked using the MSP430 compiler in Code Composer Studio v4 and it produced the same code using !return_code and return_code == 0.

      • snichols says:

        I like your approach to avoiding multiple return points, but I’m a little confused by your example assigning different values to return_code. Are those all supposed to be left shifts? Doesn’t (1 < 1) evaluate to zero?

      • Carroll Robinson says:

        As an aside, when I need to use a value with a single bit set to 1 and all other bits set to zero, I just usethe following definitions, which I have handy in a header file:
        #define BIT0 (1 << 0)
        #define BIT1 (1<< 1)
        #define BIT15 (1<<15)

        #define BIT16 (1L << 16)
        #define BIT31 (1L <<31)

        This approach makes the source code a bit simpler to read, IMHO, and removes the clutter of the parentheses, which are already included in the definition. And, it avoids errors in translating between hex and binary. Although not shown in your code above, some programmers might use 0x01, 0x02, 0x04, instead of the shift operations, for this type of thing.

        Some compiler vendors may define these bit constants in their header files (TI's MSP 430 header files do this).

  4. groovyd says:

    I tend to prefer to return as soon as possible in the function where it also saves indentation further on as Gauthier does. I also tend to make my return values boolean wherever possible instead of getting excited about error codes when usually the calling code really doesn’t care why it failed but just that it did. If I need a return code for other indications I tend to prefer enumerations. Using boolean returns and smart function naming tends to make the surrounding code easier to read then using codes. After considering these issues first I then try to reorganize my logic to minimize the number of returns.

    • Nigel Jones says:

      I too like using enumerations for return codes. In fact whenever I see lists of manifest constants (particularly if the values are contiguous) I always wonder why the author didn’t use an enumeration.

  5. Lundin says:

    Regarding only one return from a function (or only one break in a loop, another version of the same argument). This is enforced by MISRA, IEC 61508 and other such heavy documents. I have been arguing against both of these standards that this particular rule is a nonsense one.

    How can I!? Surely multiple returns are bad? In most of the cases, yes. Particularly when you jump out of the middle of a loop or some complex code… it will be confusing for the reader to see the flow of the function.

    However, it is impossible to write readable code without using multiple return statements. Anyone who has ever written a parser knows this. Text and protocol parsers are hardly rare applications, I have a protocol parser in almost every app I write.

    Instead of arguing I will illustrate with a very typical example of a parser. Which version do -you- think is the most readable one? The machine code generated from these two examples should be nearly identical.

    ParseResult_t parse (const SomeSortOf* received_protocol, SomeSortOf* parsed_data)
    {
    ParseResult_t result;

    /* lots of code decoding header: */
    BLA();
    BLA();
    BLA();

    if(header_is_faulty())
    {
    result = FAULTY_HEADER;
    }
    else
    {
    /* lots of code checking protocol size: */
    BLA();
    BLA();
    BLA();

    if(size_is_faulty())
    {
    result = FAULTY_SIZE;
    }
    else
    {
    /* lots of code checking checksum: */
    BLA();
    BLA();
    BLA();

    if(checksum_is_faulty())
    {
    result = FAULTY_CHECKSUM;
    }
    else
    {
    /* lots of code checking data correctness: */
    BLA();
    BLA();
    BLA();

    if(data_is_faulty())
    {
    result = FAULTY_DATA;
    }
    else
    {
    /* lots of code decoding data */
    BLA();
    BLA();
    BLA();

    result = OK;
    }
    }
    }
    }

    return result;
    }

    ParseResult_t parse (const SomeSortOf* received_protocol, SomeSortOf* parsed_data)
    {
    ParseResult_t result;

    /* lots of code decoding header: */
    BLA();
    BLA();
    BLA();

    if(header_is_faulty())
    {
    return FAULTY_HEADER;
    }

    /* lots of code checking protocol size: */
    BLA();
    BLA();
    BLA();

    if(size_is_faulty())
    {
    return FAULTY_SIZE;
    }

    /* lots of code checking checksum: */
    BLA();
    BLA();
    BLA();

    if(checksum_is_faulty())
    {
    return FAULTY_CHECKSUM;
    }

    /* lots of code checking data correctness: */
    BLA();
    BLA();
    BLA();

    if(data_is_faulty())
    {
    return FAULTY_DATA;
    }

    /* lots of code decoding data */
    BLA();
    BLA();
    BLA();

    return OK;
    }

    • Nigel Jones says:

      The “doing what the “standard” says’ versus what you believe to be correct is a real conundrum. I run in to this all the time with the offsetof() macro that is inexplicably banned by MISRA.

      • Notan says:

        Well I think the reason for banning multiple return statements inside a function is to avoid unexpected return statements in spaghetti code like the parse function above. (Spaghetti code is banned by MISRA too.)

        I know that Nigel does not like implementation functions first, but the times when I read C files from top to bottom are over, i use “Goto Declaration” in the IDE. My parse() would look like…

        static int decode_header(const SomeSortOf* received_protocol, SomeSortOf* parsed_data)
        {

        }
        static int check_size(…

        ParseResult_t parse(const SomeSortOf* received_protocol, SomeSortOf* parsed_data)
        {
        if (OK != decode_header(received_protocol, parsed_data)) return FAULTY_HEADER;
        if (OK != check_size(received_protocol, parsed_data)) return FAULTY_SIZE;
        if (OK != check_checksum(received_protocol, parsed_data)) return FAULTY_CHECKSUM;
        if (OK != check_data(received_protocol, parsed_data)) return FAULTY_DATA;
        if (OK != decode_data(received_protocol, parsed_data)) return FAULTY_DECODE;
        return OK;
        }

        This function still has many return statements, but I think by having one in every line, there will not be an unexpected one.
        Strictly conforming to “one return only” does not make it better but not too bad:

        ParseResult_t parse(const SomeSortOf* received_protocol, SomeSortOf* parsed_data)
        {
        ParseResult_t res = ASSERT;
        if (OK != decode_header(received_protocol, parsed_data)) {
        res = FAULTY_HEADER;
        } else if (OK != check_size(received_protocol, parsed_data)) {
        res = FAULTY_SIZE;
        } else if (OK != check_checksum(received_protocol, parsed_data)) {
        res = FAULTY_CHECKSUM;
        } else if (OK != check_data(received_protocol, parsed_data)) {
        res = FAULTY_DATA;
        } else if (OK != decode_data(received_protocol, parsed_data)) {
        res = FAULTY_DECODE;
        } else {
        res=OK;
        }
        return res;
        }

        • Josh says:

          I would also like to point out that in the example where you use one return statement, every check will happen on a “packet”, but only the last error message will get passed to the caller. At the best case, you’ve already wasted cycles checking for states that the caller will never see, unless they fix the data themselves and rerun the routine to get another error code.

          But on the other hand, delayed return would allow for a more comprehensive report on the correctness of your data if you managed to pass back more than one error code.

          Forgive me if I’m wasting time analyzing a contrived example and this was already understood.

  6. Lundin says:

    …and of course I didn’t manage to preserve any indention in this form. Seems html “pre-tags” won’t work. With proper indention it becomes much clearer which version is the most readable one.

  7. Tony Gray says:

    I agree with the other posters that returning at the very top based on some simple check is acceptable:

    void foo(void)
    {
    if (!not_fooing_right_now) return;

    }

    On those rare occasions where placing a return in the middle of a function makes the code more readable, I always add a giant frickin’ comment with neon lights and dancing girls so everyone knows there’s an early return, something like this:

    void foo(void)
    {
    ….
    /***************/
    /* RETURN */
    /***************/
    return;
    ….
    }

  8. groovyd says:

    i don’t seem to have a problem seeing returns particularly if they are done in the as soon as possible to avoid excessive indentation way. It is very clear to me that Lundin’s second example is the way to go regardless of what some big book demands. Thankfully I guess I do not write code for people requiring these sort of standards.

  9. Kuldeep says:

    Hi,

    Talking about return statements: The other day I came across very strange ‘return’ statement; I have never heard nor read about such return construct ever hitherto. This statement is from a utility ‘pidin’ source code provided by famous RTOS. Can anybody explain me how this statement works; a simple example would be appreciated

    construct:
    int
    fill_status(int expectwarn, struct shared_info *i, int *tid, int fd)
    {
    status.tid = *tid;
    if (devicectrl(fd, READ_PROC_TIDSTATUS, &status, sizeof status, 0) != EOK) {
    warning_exit(!expectwarn, expectwarn, “\ncouldn’t fill_status()\n”);
    return 1;
    } else
    return i->status = &status, *tid = status.tid, 0;
    }

    • Todd Blackmon says:

      Wow!

      I had to look this up.
      The return statement evaluates it’s expression, which in this case is: i->status = &status, *tid = status.tid, o
      Then it returns the result. The comma operator evaluates it’s left statement, throws the result away, then evaluates its right expression.

      So, the return statement is equivalent to:
      i->status = &status;
      *tid = status.tid;
      return 0;

      Whoever wrote this should be ashamed of themselves.

  10. Eric Watson says:

    The original code is interesting to say the least!
    One point to mention is that coding the function with one return statement will also make debugging a lot easier – especially for those debuggers where there’s a limited number of breakpoints available – you just need one BP at the end of the function and you know what is about to be returned. Personally I think coding it using a switch statement is probably the best/clearest implementation – the coder could also make sure (he may already have done so) that all case statements are in enum order to hint to the compiler to create a jump table in assembler which should be nicely optimised for speed.

    As for the return statements being in parenthesis, this is also a niggling point for me. My understanding (please correct me if this is wrong…) is that because C was originally used mainly for maths applications (or a lot of developers early on were using it for maths), their use of parenthesis was in-keeping with mathematical notation, and it just stuck. Personally, I see no need and it’s simply not required. A lot of programmers have probably continued to use it because it’s in a vast amount of existing code/examples/books.

    • Nigel Jones says:

      The point you make about debugging is an excellent one Eric. My worked on a project a few years back where one had (I think) just 2 breakpoints available. Perhaps it was that project that gave me religion concerning the number of exit points!
      You also make an interesting point concerning putting parentheses around the argument to return. I always find it distressing when someone does something ‘because that’s the way it’s done’. I’ve always thought that my primary job is thinking…

    • Bernhard Weller says:

      I found the use of parenthesis in a return statements rather distracting. In this case it’s most likely harmless, but it just looks like a function call on first glance. If there is poor syntax highlighting like here, the difference between
      switch (anything)
      {
      case x:
      return(something);
      case y:
      rerun(something_else);
      }
      is easily lost. Well of course naming a function rerun would most likely be bad coding style, nevertheless I think return without parenthesis makes the point much clearer.

      • Sometime/sometargets (Cortex-M3+GCC) breakpoint can be placed at last brace of function, breaking on all returns (at least i didn’t discovered “under the hood” use of more than 1 breakpoint.

        I have once read that multiple returns are very similar to using gotos – they do not allow loops but sometime program flow less readable.

  11. Michael Burns says:

    A single point of exit makes it easier to instrument source code with debug tracing which may be needed, for example, when the target does not support a source code debugger. This applies for entry/exit points and/or for error cases. In my experience, debug tracing has often to be added to existing embedded code.

    Great articles! And, well worth continuing, judging from the number, and quality, of comments.

    • Nigel Jones says:

      Thanks for the kind words Michael. The number and quality of the comments has indeed been excellent. I’m hoping that many of the visitors to this blog read the comments and learn from them. I know I do!

  12. Yes, one will fall through to the last return statement

    eh?

    I’m not a dyed-in-the wool C programmer, but I’d always assumed a failure to match any of the cases was a NOP….

    • Nigel Jones says:

      Well you certainly don’t execute any of the code contained in the ‘cases’ if that’s what you mean. For this code, since you wouldn’t match any of the cases (and thus execute any of the return statements), then execution would continue at the first statement past the switch statement – which is the return(-1);

  13. plom says:

    Do you have the same opinion about “sizeof”?
    Some people don’t use the parenthesis with
    “sizeof” with the argument that “sizeof” and “return”
    are not functions and so shouldn’t use ().

    • Nigel Jones says:

      I do use parentheses with sizeof. My understanding is that sizeof requires parentheses when its argument is a data type (as opposed to a variable). Thus sizeof(int) versus sizeof var. As a result I have got into the habit of always using parentheses with sizeof. I’d also note that sizeof is an operator whereas return is not (at least in the sense that sizeof appears in the C operator precedence table, whereas return does not). Furthermore, sizeof has high precedence. Thus sizeof var1 + var2 is not the same as sizeof(var1 + var2);

Leave a Reply to Bernhard Weller

You must be logged in to post a comment.