embedded software boot camp

Replacing nested switches with multi-dimensional arrays of pointers to functions

Monday, March 17th, 2014 by Nigel Jones

It’s been way too long since I’ve written a blog post. To those kind souls that have written to inquire if I’m still alive and kicking – thank you.  The bottom line is that there simply aren’t enough hours in the day. Anyway in an effort to get back in the groove so to speak, I thought I’d answer an email from Francois Alibert who wrote to ask how to replace a nested switch statement with a multi-dimensional array of pointers to functions. Here’s a program that illustrates his conundrum:

#include <stdint.h>
#include <stdio.h>

typedef enum 
{State1, State2, State3, Last_State}
MainState_t;

typedef enum 
    {SubState1, SubState2, SubState3, SubState4, SubState5, SubState6, Last_SubState}
SubState_t;

void demo(MainState_t State,  SubState_t SubState);

/*     Functions called from nested switch statement. 
    First digit is main state, second digit is substate */

void fn11(void);
void fn16(void);

void fn24(void);

void fn32(void);
void fn33(void);
void fn35(void);

void main(void)
{
    MainState_t main_state;
    SubState_t sub_state;
    
    for (main_state = State1; main_state < Last_State; main_state++)
    {
        for(sub_state = SubState1; sub_state < Last_SubState; sub_state++)
        {
            demo(main_state, sub_state);
        }
    }
}

void demo(MainState_t State,  SubState_t SubState)
{
    switch (State)
    {
        case State1:
            switch (SubState)
            {
                case SubState1:
                fn11();
                break;
                
                case SubState6:
                fn16();
                break;
            
                default:
                break;
            }
        break;

        case State2:
            switch (SubState)
            {
                case SubState4:
                fn24();
                break;
    
                default:
                break;
            }
        break;
    
        case State3:
        {
            switch (SubState)
            {
                case SubState2:
                fn32();
                break;
                
                case SubState3:
                fn33();
                break;                
                
                case SubState5:
                fn35();
                break;
    
                default:
                break;
            }
        }
        break;
        
        default:
        break;
    }
}

void fn11(void)
{
    puts("State 1, substate 1");
}

void fn16(void)
{
    puts("State 1, substate 6");
}

void fn24(void)
{
    puts("State 2, substate 4");
}

void fn32(void)
{
    puts("State 3, substate 2");
}

void fn33(void)
{
    puts("State 3, substate 3");
}

void fn35(void)
{
    puts("State 3, substate 5");
}

The key points are that we have nested switch statements and the substate is sparse. That is the number of substates for main state 1 is different to that of the substates for main state 2 and so on. If you’ve ever been in the situation of having to write a nested state machine like this, you’ll rapidly find that the code becomes very unwieldy. In particular functions many of hundreds of lines long with break statements all over the place are the norm. The result can be a maintenance nightmare. Of course if you end up going to three levels, then the problem compounds. Anyway, before looking at a pointer to function implementation, here’s the output from the above code:

State 1, substate 1
State 1, substate 6
State 2, substate 4
State 3, substate 2
State 3, substate 3
State 3, substate 5

In addition, using IAR’s AVR compiler, the code size with full size optimization is 574 bytes and the execution time is  2159 cycles, with the bulk of the execution time taken up by the puts() call.

Let’s now turn this into a pointer to function implementation. The function demo becomes this:

void demo(MainState_t State,  SubState_t SubState)
{
    static void (* const pf[Last_State][Last_SubState])(void) = 
    {
        {fn11, fnDummy, fnDummy, fnDummy, fnDummy, fn16},
        {fnDummy, fnDummy, fnDummy, fn24, fnDummy, fnDummy},
        {fnDummy, fn32, fn33, fnDummy, fn35, fnDummy}
    };
    
    if ((State < Last_State) && (SubState < Last_SubState))
    {
        (*pf[State][SubState])();
    }
}

Note that the empty portions of the array are populated with a call to fnDummy(), which as its name suggests is a dummy function that does nothing. You can of course put a NULL pointer in the array, and then extract the pointer, check to see if its non-NULL and call the function, However in my experience its always faster to just call a dummy function.

So how does this stack up to the nested switch statements? Well as written, the code size has increased to 628 bytes and cycles to 2846. This is a significant increase in overhead. However the code is a lot more compact, and in my opinion dramatically more maintainable. Furthermore, if you can guarantee by design that the parameters passed to demo() are within the array bounds (as is the case with this example), then you can arguably dispense with the bounds checking code. In which case the code size becomes 618 bytes and the execution time 2684 cycles. It’s your call as to whether the tradeoff is worth it.

 

20 Responses to “Replacing nested switches with multi-dimensional arrays of pointers to functions”

  1. Emerson says:

    I’m glad to know that you are back.

  2. Lundin says:

    Nice to see you back & that the whole embeddedgurus site is seemingly springing back to life 🙂

    Regarding the topic of function pointer jump tables, I would like to point out the importance of using a lot of defensive programming. Because if you have a single bug in them, the whole program will go haywire. To use the last enum member of the states as array size for the jump table, as shown in Nigel’s example, is very good practice. As is the boundary checking of whether the enum is in range.

    Be aware though, that C enums use an inconsistent type system. An enumeration constant State1 is guaranteed to have type signed int, which is not necessarily of the same type as an enumerated type MainState_t. If you think that sounds too stupid to be true, see C11 6.7.2.2. This is a flaw in the C language and a potential source of bugs in embedded systems, where compilers often implement enumerated types as equivalent to uint8_t. (A MISRA-C static analyser tool would find such bugs, while manual code review is less likely to do so.)

    For the sake of readability, it may also be wise to hide away the obscure function pointer syntax behind typedefs. Here’s an almost identical example for a state machine, that I wrote not long ago:

    http://electronics.stackexchange.com/questions/95569/best-practice-to-keep-main-in-embedded-systems/96415#96415

    Though I’m not so sure if a multiple dimension array is the best way to implement such a state machine. The most likely scenario in an object-oriented design, is that the sub states are internal for each state. For example, in a “SPI communication state”, the caller couldn’t care less about whether the SPI driver is idle, sending or receiving, as long as it is doing its job. In that case the sub states should exist as static “private” variables inside the state, and get handled internally.

    • Nigel Jones says:

      Thanks for the welcome back. Defensive programming is important. The original article I wrote on pointers to functions has a fair amount of description on defensive techniques when using function pointers, and also suggests using typedefs to hide the declaration complexity.

    • gauthier says:

      I am interested in the difference in signedness of enum constants and enum types! I knew the constants were signed int, and I have lacked the ability to decide the signedness of the type. I assumed (wrongly, apparently) that the type was also signed, which bugged me when doing boundary checks: it feels like a shame to check if a variable is both under MAX *and* positive, when the need for positiveness would have been removed by making the type unsigned. I have even considered creating a separate typedef for the type associated with the enum, although this removes the type checking advantage.

      Talking about typedef, don’t you think that `void (*) (void)` is common enough to be a known idiom? It’s used only for the array declaration, I don’t really see the need of typedefing it away.

      • Lundin says:

        Regarding the need to typedef away void(*)(void), C speaks for itself. Compare these lines:

        func_t function (func_t p);

        and

        void(*function (void(*p)(void)))(void);

        Are you really certain you don’t want to use typedef after all? 🙂

        • gauthier says:

          Sure, that’s a good example. It depends on the usage. If the only usage you have is that described in the article (one array of void (*) (void)), I don’t see it as confusing enough to typedef it.

  3. gauthier says:

    Nice to have you back!
    This situation (although I’ve only used it for one-dimensional arrays) is one where C99 helps a lot, with named initializers. It’s much harder to associate a function with the wrong index, and the indexes that are not mentioned are initialized to NULL (which would favor checking for non-NULL before dereferencing, which I always do anyway):

    static void (* const pf[Last_State][Last_SubState])(void) =
    {
    [State1] = {
    [SubState1] = fn11,
    [SubState6] = fn16
    },
    [State2] = {
    [SubState4] = fn24
    },
    [State3] = {
    [SubState2] = fn32,
    [SubState3] = fn33,
    [SubState5] = fn35
    }
    };

    Maybe you could try this and check size and execution time?
    (let’s see if code formatting in comments has improved).

  4. Jeff says:

    This is a good technique that everyone should understand – thanks for the post.

    Greater code size due to the empty functioning pointers makes sense but any thoughts on why execution time goes up? Indexing into the array is a pretty simple operation.

    • Nigel Jones says:

      I strongly suspect that IAR’s optimizer is working out that the array is so sparse that some sort of nested if is more efficient. I suspect that with a much larger state machine the PtoF would compare much more favourably.

      • Julian Day says:

        I suspect that the execution time is slower because of all the extra calls to fnDumy(). If you compared to NULL instead, then it would probably be faster by avoiding the overhead of a function call for each dummy state, as there are more transitions to dummy states than useful transitions. However, in a real system, I’m not sure it would be realistic to have such a high proportion of transitions to ‘dummy’ states.

        The bigger advantage of this is that since it is a table lookup, the time per execution of demo() is likely to be close to constant, which isn’t much less likely to be the case for nested switch statements.

  5. Miro Samek says:

    I’m sorry to be such a party pooper, but the example code is not very good. I understand that the purpose here is just to illustrate the use of function pointers, and state machines are definitely the “killer app” for it. (I’ve even devoted the Section 3.7.1 of my book “Practical UML Statecharts in C/C++, 2nd Ed.” to discussing the role of pointers to functions in state machine implementations.)

    But the state machine examples with or without pointers to functions are rather bad and I would *not* recommend to use any such code in real projects.

    So, what exactly I dislike about the example (the last one with pointers to functions)?

    First, the code seems to be inspired by the venerable state-table technique, but is confusing because the second dimension is used by “sub-states” (instead of events). The use of sub-states would suggest some sort of state hierarchy, but this is certainly not a hierarchical state machine. So what is it?

    Second, presumably there are some state transitions in this state machine, but none of the functions illustrates how to achieve a transition (with a global variables “state” and “substate” ?).

    Third, the technique in the original post requires enumerating the “states” (and “substates”!). Better techniques don’t require enumerating states (only events). For example, every state can be mapped to a function (state-handler function). Then a single pointer-to-function “state variable” can always point to the currently active state. A state transition in this case simply changes this pointer-to-function. This avoids the need to enumerate states.

    Anyway, while I definitely agree that state machines are the best application for pointers to functions, the actual implementation matters.

    • Nigel Jones says:

      Actually Miro, the point of the post was to answer a question I’d received about how to replace nested switch statements with a sparse 2-D array of function pointers. Having said that, as Lundin alluded to in an earlier comment, I agree this isn’t necessarily the best way to implement a state machine, and in particular a hierarchical state machine.

    • gauthier says:

      I would even take out the second dimension and build a totally separated state machine (I suppose that is the kind of things Lundin meant with his object-oriented comment).

      Interesting point about storing a pointer to function instead of a table index! I have used the table-based state machine (very much like Nigel’s example) in the past, and sanity-checked the index value as well as non-NULLness of the function pointer at that index. I am not very sure why, but it seemed a good idea to check if the state variable (static linkage, not a “real” global) was within boundaries. You cannot easily do that with storing only the function pointer, can you?

      Again, I’m not sure why I worry the function pointer would get an erroneous value, but if it does then it’s hard to detect. You could argue that if it could, then why couldn’t the content of the function pointer array, or the state variable within its boundaries…

      It is often useful to do something special on the entry of a state. How do you detect that, do you compare the previous function pointer value to the current one, and if it’s different you got an entry? Or do you create an additional entry state for all states?

  6. Larry says:

    I’ve used this technique a few times. However, I’ve found that it becomes even more useful when expanded upon.
    I create a structure:
    {
    { state1, fptr1 },
    { state2, fptr2 }
    { ……. , ….. },
    { NULL, errFctnPtr }
    };

    Then the code will search the table for a match on state.
    The code then contains a loop indexing through this table comparing the state against the state entries in the table.
    If there’s a match, then run the function. If it gets to the NULL entry, then execute the errFctn.

    This makes the code very simple and modular. It makes it easy to read. Any new states can be added
    to the table very easily. This works especially well for handling communications,
    where the state is the received command.

  7. Salvatore Ragucci says:

    Sorry if off topic, and the article is an interesting exercise, but until somebody learns me better it seems to me that discussions about state machines need go no further than the Quantum Platform components. Correct, document-able, maintainable, understandable, portable, affordable, recognizable, orthogonal, and fun.

  8. Ashleigh says:

    I have mixed feelings on this.

    Having written many, many state machine based pieces of code (especially communication protocol implementations and drivers) over the years, my preference is to always prefer obvious over compact / efficient.

    Sometimes Obvious means you use nested switch statements, simply because a reading of the code makes the execution conditions, input conditions, etc…. kinda obvious. When adding / modifying / maintaining code, obvious things tend to be understood faster – even if less conceptually elegant.

    One of the other benefits of switch statements is that the legal values and bounds are checked; you can use default clauses, so sanity checking of values, parameters, etc is easy. (And in a state machine, EVERY switch statement should have a default clause that allows recovery in the event of insanity.)

    All embedded systems – but especially those that run for long periods of time – should handle crazy values that are illegal – single event upsets are real and will cause (very) infrequency changes of variables in RAM – by perhaps only a single bit. So state variables should be bounds checked and a recovery process should be coded in every case.

    Use of function pointers makes the bounds + sanity checking process far more difficult. The table of static function pointers in most embedded system would be placed into ROM / Flash, and thus is not affected. So what should be worried about is the index into that table – which should always include checking and sanity recovery. Even then, the extremely rare runaway due to single event upsets is still possible.

    My experience of this kind of thing is that in many (most?) cases, the code can be refactored in some way that removes, reduces, or simplifies the nested switches. And that approach, whilst keeping things obvious, and avoiding function pointers is probably the better way to go. It may require more thought – but like most highly skilled jobs, that’s what we are here for. (also known as: if it were easy, everyone would be doing it.)

  9. Marcel says:

    Some notes:
    1. Function pointers are with no doubt a very useful mechanism.

    2. Replacing switches with function pointer tables (the concepts is called lookup table…) may have some advantages in some situations but it can bring a lot of disadvantage as well (code size, readability, robustness).

    3. the examples above are not state machines at all – nevertheless the source code suggests it (“state, substate”) Whats missing is a mechanism which ties states and events together to transitions. The above examples only call certain predefined functions depended on two parameters. Thats all.
    The usual way to encode state transitions into a 2-dimensional table which maps every possible input state with every possible output state to an action results in huge ROM memory footprints if there are more than some trivial states. The memory complexity is O(n^2). which is not a good idea for micro controllers with limited flash ROM sizes.

  10. Brendan says:

    The `_t` suffix for typenames is reserved by POSIX (when on such platforms).

Leave a Reply

You must be logged in to post a comment.