embedded software boot camp

#include "includes.h"

Thursday, September 9th, 2010 by Nigel Jones

I am sure that the title of this post is familiar to most of my readers: you have opened up a C source file and found a single #include statement that references a file typically called ‘includes.h’. On opening ‘includes.h’ one invariably finds an enormous list of other header files. Furthermore, as you go through the other source files in the project, you find that they all use ‘includes.h’. I suspect that by this point the readers of this blog are divided into two camps:

  1. Either: So what, I do it all the time because it makes my life a lot easier.
  2. Or: I want to scream whenever I see this done.

I’m one of the screamers – and this is my rationale.

Back in the dark ages, when one had to compile on computers with extremely limited resources, the compilation time of a module was a major issue. One of the things that significantly affected compile time was the number and size of the header files that a module opened. As a result, most of us took steps to ensure that we included only the header files that were needed. However, as processor speeds increased and compilers started using pre-compiled header files, this became less of an issue, such that today I seriously doubt you’d notice much difference in compilation time regardless of the number of header files included. I don’t know for certain, but I suspect that this was the enabler that caused people to start using ‘includes.h’.

So if compilation time is no longer an issue, what’s the big deal? After all we have all had the hassle of compiling a file only to be told that we are missing a prototype or a data type. At which point we have to hunt down the requisite header file, include it and recompile. If you do this half a dozen times in a new module, then it takes you say 15 minutes before everything is OK – and who has 15 minutes to waste on such irritating details? Well, in my opinion it’s time well spent. Here’s my case:

Coupling Indication

The number of header files a module needs to use is a crude but effective indicator of coupling. A module that needs to include almost no header files is clearly extremely self-contained; that is, it isn’t relying upon the outside world. Modules like this are typically easier to maintain and are also more immune to changes made elsewhere in the system. In short, I like modules that don’t have to include a lot of header files. Indeed, when I have finished writing a module, I take a look at its include list. If the list is long, it really makes me wonder whether I should break the module apart in some way so as to reduce the degree of coupling between it and the outside world.
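To make this concrete, here is a sketch of what a low-coupling module can look like: a CRC-16 routine (an illustrative example of mine, not from any particular project) whose only dependency is the standard fixed-width types header.

```c
#include <stdint.h>   /* the module's ONLY include: fixed-width types */

/* CRC-16/CCITT-FALSE, updated one byte at a time (MSB first). */
uint16_t crc16_update(uint16_t crc, uint8_t byte)
{
    crc ^= (uint16_t)((uint16_t)byte << 8);
    for (uint8_t i = 0u; i < 8u; i++) {
        if (crc & 0x8000u) {
            crc = (uint16_t)((crc << 1) ^ 0x1021u);
        } else {
            crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}
```

A one-line include list like this is exactly the signal described above: the module can be dropped into another project untouched.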

Maintenance – understanding coupling

This is related to the first point. If I need to do some maintenance on a module, then a quick look at the include list can tell me how this module interacts with the rest of the code. This can be extremely useful if one is trying to understand how a program is put together.

Maintenance – understanding functionality

If I look at the include list and I see ‘math.h’, then I know that the module is using transcendental functions, which in turn implies complex floating point operations, which in turn implies potentially long execution times. In a similar manner, if it includes the header for the hardware interrupt handler, then I know I’m dealing with something related to the chip. I can get all this sort of information in a two second scan of the include list.

Documentation

If you use an automated documentation tool such as Doxygen, then only including the header files that are needed by a module ensures that Doxygen generates a meaningful documentation set for you, rather than including hyperlinks to useless files.

Not getting what you want

I have left what is probably the biggest problem to last. By including an enormous number of header files you lay yourself wide open to problems like this:

Header1.h

#define FALSE 0
#define TRUE  !FALSE

Header17.h

#ifndef FALSE
#define FALSE 0UL
#define TRUE  1UL
#endif

Header26.h

#pragma(IGNORE_REDEFINITION_OF_MACROS)
#define FALSE NULL
#define TRUE !FALSE
#pragma(ERROR_ON_REDEFINITION_OF_MACROS)

Trust me when I tell you I have seen this done! In other words the more files you include, the more likely it is that the macro that you are blithely using does not in fact have the value you think it does. Time to debug problems such as these – a lot longer than 15 minutes!
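The damage is easy to demonstrate. The snippet below is my own illustration, using only Header1.h’s definitions: any code that compares against the TRUE macro, rather than testing for truth, silently depends on which header’s definition of TRUE won. (read_status is a hypothetical function standing in for a hardware status read.)

```c
#define FALSE 0
#define TRUE  (!FALSE)   /* Header1.h's version: TRUE is exactly 1 */

/* Illustrative: a status read that returns a raw flag byte. */
static unsigned read_status(void)
{
    return 0x04u;   /* non-zero, i.e. "true" in the C sense */
}

/* Comparing against the macro depends on TRUE's exact value... */
static int enabled_wrong(void)
{
    return read_status() == TRUE;   /* 0x04 == 1 is false! */
}

/* ...whereas testing for truth directly always behaves as intended. */
static int enabled_right(void)
{
    return read_status() != 0u;
}
```

Now imagine the value of TRUE quietly changing mid-project because a different header’s definition won: every comparison like enabled_wrong() is in play.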

Remedial Action

On the off chance that I have convinced an ‘includes.h’ fan of the error of their ways, it would be remiss of me to not tell you how to quickly find out just the header files needed by a module.

  1. Paste the include list from includes.h into the module.
  2. Delete the entry for includes.h.
  3. Compile the code to make sure you haven’t broken anything.
  4. Lint the file. Lint will tell you all the header files that aren’t being used.
  5. Delete the unnecessary include statements.
  6. Repeat from step 3 until Lint is happy.

Of course the chances are that if you use ‘includes.h’ you aren’t using Lint. If you do start using Lint then it will do a lot more for you than just telling you about unnecessary includes.

28 Responses to “#include “includes.h””

  1. Lundin says:

    Very good post, this is more important than people think. I’m definitely one of the “screamers” as well. Having a sloppy “includes.h” is the easiest way of introducing “tight coupling” into your program, i.e. dependencies between code modules.

    Tight coupling is without doubt something that must be avoided in all projects, each code module should be autonomous as far as possible. There is even scientific research proving that tight coupling is one of the main reasons for catastrophic failures in embedded software, particularly when it occurs in complex programs.

    The way of creating autonomous code modules that Nigel describes is fundamental object-oriented design. In my opinion, OO design has very little to do with “class” keywords, constructors, template programming and other such fluff. The uttermost important part of OO is private encapsulation/isolation of the code into autonomous objects.

    Also, if you only #include the things a particular code module actually needs, you will be able to reuse your code module in other projects without any changes made to it. Quite handy.

  2. Bernhard Weller says:

    I also use single includes in my projects. For instance, I wrote a little CRC routine for serial transmission, completely encapsulated in one source file and one header file, needing only an include of a type-definition header. So if I need a CRC module I can just take two files, check for the right data types, and paste them into any other project I’m working on. As it’s mostly used in serial transmission, the UART module there will include my crc.hpp and everything is fine.

    My main routine doesn’t use any CRC calculation at all, so why should I include it there? I think it becomes much clearer what a module does, and what it needs, if you have an include list at the beginning rather than including just about everything everywhere so that you don’t have to worry in each module about what you actually need. That’s just time saved on the wrong thing, which will really get you if you ever have a problem with your includes or try to port a module somewhere else.

  3. GroovyD says:

    a lot of times people use them when interfacing with large lower level apis or libraries where they do not want to worry about where the specific declarations of the functions they are using live, but just that they are somehow coming from that library; include the library header and go.

    i can see the advantages of both, but not so sure I would put one way ‘better’ than the other.

  4. Gauthier says:

    I think nearly all of your readers are in the screamer category, Nigel.

    I would like to point out a problem I have had, using a proprietary assembly language that featured a C like include system.
    Since it was assembly there was no inline keyword, and to achieve its effect the only solution was to write macros for the preprocessor to expand (thankfully the preprocessor had a better way of declaring macros than simple #defines).
    The side effect of that was that these macros had to be located in an include file (for I’d never #include an s file, or a c file!)

    Then what to do when you had to define a macro (in an include file) that needed another macro in another include file? That’s right, include the include file from within the first include file.
    I really disliked it, but that’s the only practical way I could find.

    I actually put the #include statement inside the macro, so that the statement would be read by the preprocessor only if the macro was used. This meant having the #include statement in the middle of the file, which is suboptimal. But I preferred it to unnecessary coupling.
    Any opinion?

    • Nigel Jones says:

      It is a tough call. I have had to resort to similar techniques in the past. I think the key thing is that it’s done as a last resort – and not as the first.

      • Gauthier says:

        Another related observation on including h files in other h files:
        When your function declarations (in the header file) need stdint.h, where do you include it? Do you include stdint.h in your header file, or do you rely on the calling c file to have included it for you? You could arguably include it in the compiler command line, but that’d include it for every compilation… which might be ok in the specific case of stdint.h, but the principle feels bad.

        • Nigel Jones says:

          This is actually part of a more general issue of what to do when a header file requires another header file for type definitions. There is no great solution for this. However, this is what I do. If a header file requires the use of another common header file (such as stdint.h) then I do *not* include the required file. Instead I rely on (require!) the required header file being manually included in the C file. In practice this isn’t much of a burden because the C file usually needs to include that file anyway (particularly in the case of stdint.h). If a header file requires another very obscure header file, then I will conditionally include it, reasoning that the C file is unlikely to include the obscure header file, and that someone (me :-)) is unlikely to know what header file is required. Where it gets difficult is when a header file is neither common nor obscure. In this case I come down strongly on not including the header file – for the reasons listed in the blog post. Doing so occasionally slows me down in that I have to work out which header file I need to add to support the header file I really want to use. However, this is usually a 2-minute task – which pales in comparison with the potential problems caused by automatically including header files.
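A minimal sketch of that policy, with illustrative names (motor.h, motor_clamp); the #error guard is my own addition, not something from the comment above. The header does not include <stdint.h> itself, but fails loudly at compile time if the client .c file forgot to:

```c
#include <stdint.h>   /* the client .c file includes it first, as required */

/* --- what an illustrative "motor.h" could contain --- */
#ifndef UINT8_MAX     /* UINT8_MAX is defined by <stdint.h> */
#error "motor.h requires <stdint.h>: include it in the .c file first"
#endif

uint8_t motor_clamp(uint16_t raw);
/* ---------------------------------------------------- */

uint8_t motor_clamp(uint16_t raw)
{
    return (raw > 255u) ? (uint8_t)255u : (uint8_t)raw;
}
```

The guard turns a cryptic "unknown type uint8_t" cascade into one clear diagnostic, while keeping the header free of includes.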

          • Gauthier says:

            I see. But I fail to see the point of putting the include under a conditional, as in your example below:

            #ifndef FOO_H
            #include "foo.h"
            #endif
            Of course FOO_H must be defined in foo.h

            I expect foo.h to have the multiple include protection:

            #ifndef FOO_H
            #define FOO_H

            #endif // FOO_H
            in which case the conditional you quoted seems unnecessary (it is already in foo.h itself). The only difference I can see is that foo.h is opened in one case and not in the other. Is that what you are aiming at?

            Also, I wonder in which cases including in the h file is more a problem than requiring the client c file to include a necessary h file.

          • Nigel Jones says:

            The major advantage of doing this is that it stops Lint from complaining about an include file being used multiple times. It’s also marginally faster on compilation.

    • Lundin says:

      If I’d scream when spotting tight coupling, I’m going to yell my lungs out when spotting function-like macros. My opinion is that they should be avoided at all costs. Personally I think of them as the most hideous piece of code I can ever write; even tight coupling is a minor problem in comparison.

      Not only are they poorly defined in the standard, with multiple cases of undefined/implementation-defined behavior, error-prone and painful to debug, they also make the code unreadable/unmaintainable and likely unportable. Using function-like macros to -increase- readability is just laughable; I won’t even comment on that. The only use for such macros is inlining. And how often do you actually have to use inlining?

      Using inlining to save program memory is a really poor argument today, even on the tiniest <$1 8-bit MCUs there is plenty of flash. And most often inlining -increases- program memory, as it excludes code re-usage.

      So the only real use for function-like macros is inlining to save the extra execution time coming from function calls.

      And only if you are writing C90, because C99 and C++ have inlining. And only if the code must be portable and the compiler doesn't provide a compiler-specific way of inlining (smart compilers provide a portable way through #pragma).

      And only when the program has a function that must be inlined at all costs because it is called from -plenty- of very performance-critical places of the code, so many places that you couldn't just type out the code instead of using a function. Typing out the code would otherwise be the best solution, because if you ever face such an extreme situation, readability be damned, you'll only wish the code to do its intended purpose in time.

      I work almost solely with "hard" realtime apps in resource-limited MCU applications, and I can't think of one single occasion in the past where I was actually forced to use inlining. In my opinion, inlining with function-like macros is just one of those "obfuscate for performance" things, where the programmer decides to make their program an unreadable, unmaintainable mess to save 1 clock tick.

      • Ashleigh says:

        I’d also scream… the point about coupling is really important in firmware over a certain size/complexity. My rough rule of thumb is that at more than about 4 code units it’s something you should be taking into account.

        As for function-like macros – there are 2 cases where I use these, as a last resort:

        1. In a Hardware Abstraction Layer – where for one platform it’s perfectly OK to have a thing that looks like a function but is actually a macro – for example, a hardware-neutral reading of some buttons. On another platform you might need to implement a row/column matrix key reader, or in Windows you might need to grab the state of a button on the screen that was clicked. A function-like macro allows the embedded device that can go fast to do so w/o the function call overhead; and on the devices where you need a function call you can have one, and the LOOK of the API does not change. I’m aware there is an argument here to just always use a function call anyhow – but sometimes performance really does matter.

        2. For some operations on peripherals – for example, suppose I use a capture / compare block to generate a PWM waveform. I can either put through my code exciting things like TACCR0 |= OUTMOD_2 (which requires me to THINK HARD when I read the code later); or I can define a macro #define M_SET_OUTPUT_SET_GO_HIGH { TACCR0 |= OUTMOD_2; } and then use that in the code. At least when I read the code I know what it is supposed to do. Mind you that #define better not ever be exported!

        Neither of these things are especially perfect ways to go. They just make the subsequent code maintenance a little easier, and they promote a little more consistency throughout the code as well – for example, using that macro several times to get the same result can actually lead to less buggy code because silly mistakes can be avoided (eg doing that operation 3 times…)
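A sketch of case 1, with hypothetical names throughout (TARGET_BARE_METAL, BTN_PORT_IN, HAL_BUTTON_PRESSED are all invented for illustration): the API looks identical on both platforms, but only one of them pays a function-call overhead.

```c
#ifdef TARGET_BARE_METAL
/* target build: direct register read, no call overhead
 * (BTN_PORT_IN is a hypothetical memory-mapped register) */
#define HAL_BUTTON_PRESSED()  ((BTN_PORT_IN & 0x01u) != 0u)
#else
/* host/simulator build: a real function behind the same-looking API */
static unsigned simulated_port = 0x01u;   /* pretend the button is down */

static unsigned hal_button_pressed(void)
{
    return simulated_port & 0x01u;
}

#define HAL_BUTTON_PRESSED()  hal_button_pressed()
#endif
```

Calling code writes `if (HAL_BUTTON_PRESSED()) { ... }` on every platform and never knows which variant it got.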

        • Gauthier says:

          Another example in the peripheral case.
          Say you are coding an I2C driver. You want the data pin to be either a low output, or a high impedance.

          i2c_set_high: set the GPIO as input (hiZ)
          i2c_set_low: set the GPIO dir as output and GPIO level as low.

          It would be a pain, and would decrease readability, to inline the i2c_set_low code everywhere you need to drive the pin low.
          Depending on the architecture, the code itself takes as much memory as the call instruction to a function, so there is no memory to be saved by using a function – though there are extra CPU cycles to lose.
          And if you make a mistake (setting the direction and output value in the wrong order, for example), you have to fix it in only one place.
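A sketch of that I2C case, with the registers modelled as plain variables (PORT_DIR, PORT_OUT and SDA_MASK are hypothetical names, not any real device's): the order-sensitive low/high operations live in exactly one place.

```c
static unsigned char PORT_DIR;   /* direction register model: 1 = output */
static unsigned char PORT_OUT;   /* output level when the pin is driven */
#define SDA_MASK 0x02u

/* release the line: pin becomes an input (hi-Z, pulled high externally) */
#define I2C_SDA_HIGH()  (PORT_DIR &= (unsigned char)~SDA_MASK)

/* drive low: set the level first, then the direction, so the pin
 * never actively drives the bus high */
#define I2C_SDA_LOW()                                  \
    do {                                               \
        PORT_OUT &= (unsigned char)~SDA_MASK;          \
        PORT_DIR |= SDA_MASK;                          \
    } while (0)
```

If the level/direction ordering ever turns out to be wrong, there is a single macro to fix rather than a dozen call sites.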

      • Jason Malarkey says:

        I’ve been running into issues with how a couple of compilers handle inline, in situations where the extra few instructions for a function call really do matter. While the compilers are certainly following the C99 standard, their behavior makes their inline behavior useless for my needs. The compilers inline just fine for the first X uses of an inline function. Once an inline function is called X+1 times, the compilers stop inlining and make it a function call. Since this project is more of a source library, things work fine in my test harness, with a low number of calls to the inline functions, then fail miserably when used in real applications. X is not documented by the compiler vendor, and doesn’t have an option to set it. This occurs regardless of other optimization options. In this case, I am forced to use macros to ensure the code works correctly.

        • Nigel Jones says:

          Some compilers allow you to force a function to be inlined. For example IAR has a pragma inline=forced that solves the problem you are talking about. Of course this is only useful if all the compilers you need to support have such a feature.
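One way to wrap such compiler-specific features behind a single portable name (a sketch; the GCC attribute and IAR pragma spellings are the commonly documented ones, but check them against your own toolchain's manual before relying on them):

```c
/* Map one FORCE_INLINE name onto whatever the compiler offers,
 * falling back to plain inline where nothing stronger is known. */
#if defined(__GNUC__)
#define FORCE_INLINE  static inline __attribute__((always_inline))
#elif defined(__ICCARM__)   /* IAR: inline=forced pragma before the body */
#define FORCE_INLINE  _Pragma("inline=forced") static inline
#else
#define FORCE_INLINE  static inline
#endif

FORCE_INLINE unsigned swap16(unsigned x)
{
    return ((x & 0x00FFu) << 8) | ((x & 0xFF00u) >> 8);
}
```

This keeps the compiler dependence in one header instead of scattered through the code, which also makes it easy to audit which functions truly demand forced inlining.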

  5. Miro Samek says:

    The uC/OS-II RTOS from Micrium uses the common kitchen-sink include file called “includes.h” and Micrium also recommends this design practice in their published C coding standard. Micrium claims superior quality of their source code. Any comments?

    • Nigel Jones says:

      Very interesting. Does Micrium explain why they think this is good practice – or do they merely state it as if it is self evident?

      • Adam Scheuring says:

        When I saw the title I was pretty sure that uC/OS-II triggered you to write this article. I totally agree with you, Nigel! When I started to use uC/OS for my own purposes, the first thing I did was restructure this include concept; however, as it is part of their coding standard, every port they contribute inherits this bad practice, and it hasn’t changed in the new uC/OS-III either.
        To answer your question, I would paste the text from the book (I hope I’m not violating any copyright rules as this page is available as a Google Book):

        “INCLUDES.H is a master include file and is found at the top of all .C files. INCLUDES.H allows every .C file in your project to be written without concern about which header file is actually needed. The only drawbacks to having a master include file are that INCLUDES.H might include header files that are not pertinent to the actual .C file being compiled and that the compilation process might take longer.”

        http://books.google.com/books?id=exHUsQoEgD4C&lpg=PP1&dq=microC%2FOS-II%20%20book&pg=PA341#v=onepage&q=microC/OS-II%20%20book&f=false

        I don’t want to seem like I’m joining a conversation to brag about a particular product but this example shows that even commercial product vendors can prioritize “convenience” higher than source code quality.

        One more thought I would add to this: as is often said, source code is read more than it’s written, and correct modularization (which for me includes a correct include concept) helps you read and LEARN the code better. This is even more true when you are dealing with a complex product such as a protocol stack or an operating system.

        • Nigel Jones says:

          Very well put Adam. I must say I’m flabbergasted by Micrium’s cavalier attitude to this.

        • Lundin says:

          In my experience, software like open-source RTOSes or protocol stacks is as far from state of the art as possible. As they want to support every target imaginable, the code typically turns extremely ugly, filled with pre-processor junk.

          Just look at listing 14.4 in that link… obfuscated, shortened variable names, mysterious macro calls, shift operations on signed type operands, plenty of reliance on implicit integer promotions, hardcoded numeric constants… The code isn’t horrible, but it is surely not state of the art either. It is something that some average embedded-Joe would write.

          • Ian Johns says:

            As a disclaimer, I develop for Micrium, so my reply is biased.

            ‘includes.h’ is ugly, but it is convenient, as Mr. Scheuring points out. For a small system or module with tight coupling, I suppose we’ve convinced ourselves that its convenience outweighs its ugliness. On a larger system, I might not wish to use a single ‘includes.h’, but I might group certain modules/components together & use various ‘includes.h’-type headers where appropriate. I won’t claim that this approach is superior, just that it is convenient & perhaps not the worst coding faux pas.

            Rebutting Lundin’s analysis of listing 14.4 in µC/OS-II book:
            (a) Micrium naming conventions have always followed an abbreviation & mnemonic dictionary. I’d hope that developers who consistently use their own abbreviations & mnemonics wouldn’t find our particular dictionary names that obfuscated or confusing.
            (b) Unless I am missing something, I can’t find the “shift operations on signed type operands”. I see that both number-of-places-to-shift values are signed constant integers, but the operands that are shifted are unsigned integers.
            (c) I’ll grant you the mysterious macro call & magic #s. However, this listing was for a specific x86 BSP port. I suppose the thought process was that port code didn’t need to be as high-level as the core system code.

            Also to be fair, this listing is 12+ years old. Since then, some of us more OCD types at Micrium have worked to improve the coding standard so that our code base becomes more consistent, readable, & maintainable. However, I’ll always be the first to admit that the process of improvement should never stop — which implies that there’s always going to be someone else that can teach us a thing or two.

          • Nigel Jones says:

            Thanks for taking the time to make this comment Ian. It’s always good to hear an opposing view – particularly from a manufacturer on the receiving end of criticism.

  6. Jason Malarkey says:

    I’ve been reconsidering my stance to the whole “includes.h” concept recently. I personally think it is a bad practice, but given the skill level of the average developer, I have started to wonder if it isn’t a better way to go from a support standpoint. Educating the developers using some of the stuff I am working on is clearly not working.

  7. Laz says:

    OK, I’ll be the naysayer. Here are the system files that are in my “C_includes.h” (ADSP-21368):

    #include “std.h”
    #include “math.h”
    #include
    #include
    #include
    #include “Cdef21368.h”
    #include “sru21368.h”
    #include “signal.h”
    #include

    In addition, I have a file for Constants (TRUE, FALSE, _PI_, CLK_SPEED, etc), and another for my own Prototypes.

    In at least 3 projects that I have inherited, I have found multiple definitions for _PI_ and multiple different definitions for a prototype. One extension of your argument is “Why bring in the entire H file, just retype the prototype you need to get the warning to go away.”

    Yes, all bad coding practices, but I’ve seen them too often to ignore them. Using a common “includes.h” makes it much more likely that I’m always using the right libraries and protos. And if there is an error/conflict, it forces me to correct it at a system level, not patch it in one file or another.

  8. prashanjit ghosh says:

    Wonderful tips to reduce code size, especially the replacement of a switch with an if-else block. I saved around 200 bytes of precious space in one of my timer interrupt handler functions by doing that.

  9. Alin Micu says:

    Great post!
    I have a quick question: Is it considered a good practice to include header files into another header?
    Thank you!

    • Nigel Jones says:

      In general – no. However in practice it is unavoidable at times. Personally I work hard to avoid doing it. When it’s necessary I usually use a construct such as
      #ifndef FOO_H
      #include "foo.h"
      #endif
      Of course FOO_H must be defined in foo.h
