One of the interesting aspects of being an embedded systems consultant is that I get to look at a lot of code written by others. This can come about in a number of ways, but most commonly occurs when someone wants changes made to an existing code base and the original author(s) of the code are no longer available. When faced with a situation such as this, it is essential that I quickly get a sense for how maintainable the code is – and thus how difficult changes will be. As a result I have developed a few techniques to help me assess code maintainability which I thought I’d share with you.
After installing the code on my system, the first thing I do is run SourceMonitor over the code. SourceMonitor is a free utility that computes various metrics. The metrics and their values for a typical code base of mine are shown below.
Number of Files: 476
Lines of code: 139,013
% branches: 6.3%
% comments: 41.7%
Average statements / function: 11.8
Max Complexity: 158
Max depth: 9+
Average depth: 0.54
Average complexity: 2.38
Probably the only thing that needs explanation is ‘complexity’. The author of SourceMonitor is not computing the McCabe complexity index, but rather is computing complexity based upon Steve McConnell’s methodology. The details of the implementation aren’t particularly important to me, as I’m more interested in comparative values.
While SourceMonitor helps give me the big picture, it is nowhere near enough – and this is where it gets interesting.
The next thing I look at is the optimization level used for each build. These can be very revealing. For example, if a high level of optimization is being used for the debug build, then it may mean that a non-optimized build either will not fit into the available memory, or that the code doesn’t run fast enough unless optimization is turned on. Either points to a system that is probably going to be tough to maintain. Conversely, if the release build doesn’t use full optimization, then I take this to mean that the code probably doesn’t work when optimization is turned on. I have written about this in the past and consider this to be a major indicator of potential code quality problems.
Having looked at the optimization levels, I then perform a grep on the code base looking for the number of instances of ‘volatile‘ and ‘const‘. If the number of instances of volatile is zero (and it often is) and the optimization level is turned way down, then it’s almost certain that the author of the code didn’t understand volatile and that the code is riddled with potential problems. Whenever this happens, I get a sinking feeling because if the author didn’t understand volatile, then there is little chance that they had any appreciation for race conditions, priority inversion, non-atomic operations, etc. In short, the author was a PC programmer.
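To make the volatile point concrete, here is a minimal sketch of the classic case: a flag shared between an interrupt handler and the main loop. The names (`timer_isr`, `consume_tick`, `tick_count`) are hypothetical and the "ISR" is an ordinary function standing in for a real interrupt handler, so treat it as an illustration rather than code from any particular project.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state shared between an interrupt handler and the main loop.
 * 'volatile' tells the compiler the value may change outside the normal
 * flow of the code, so each read and write must really access memory. */
static volatile bool g_tick_pending = false;
static volatile uint32_t g_tick_count = 0u;

/* Stand-in for a timer interrupt service routine (illustrative name). */
void timer_isr(void)
{
    g_tick_count++;
    g_tick_pending = true;
}

/* Called from the main loop: returns true once for each pending tick. */
bool consume_tick(void)
{
    if (!g_tick_pending) {
        return false;
    }
    g_tick_pending = false;
    return true;
}

/* Returns the number of ticks seen so far. */
uint32_t tick_count(void)
{
    return g_tick_count;
}
```

Without the qualifier, an optimizing compiler is free to read `g_tick_pending` once and never look at it again, which is exactly the kind of code that "works" only with optimization turned down. Note that volatile does not make an access atomic; on a target where a 32-bit read takes more than one instruction, `tick_count()` would still need protection against the interrupt.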
The ‘const‘ count is less revelatory. If the author makes use of const then this is normally an indicator that they know their way around the compiler and understand the value of defensive programming. In short I take the use of const to be very encouraging. However, I can say that I have known some excellent embedded systems programmers who rarely used const, and thus its absence doesn’t fill me with the same despair as the absence of volatile.
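As an illustration of the defensive use of const I have in mind, here is a short sketch. The table and function names are made up for the example, and whether const data actually ends up in flash rather than RAM depends on the toolchain.

```c
#include <stddef.h>
#include <stdint.h>

/* Read-only lookup table: number of set bits in each 4-bit value.
 * Declaring it const lets the compiler reject accidental writes and,
 * on many embedded toolchains, allows it to be placed in read-only memory. */
static const uint8_t k_bit_counts[16] = {
    0u, 1u, 1u, 2u, 1u, 2u, 2u, 3u,
    1u, 2u, 2u, 3u, 2u, 3u, 3u, 4u
};

/* 'const uint8_t *' documents (and enforces) that the caller's buffer
 * is only read, never modified. */
uint32_t count_set_bits(const uint8_t *data, size_t len)
{
    uint32_t total = 0u;

    for (size_t i = 0u; i < len; i++) {
        total += k_bit_counts[data[i] & 0x0Fu]; /* low nibble */
        total += k_bit_counts[data[i] >> 4];    /* high nibble */
    }
    return total;
}
```

Any attempt to write through `data` or into `k_bit_counts` becomes a compile-time error rather than a run-time surprise.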
Incidentally, in my code base described above, there are 53 instances of ‘volatile’ (note that I have excluded compiler vendor supplied header files which define all the various hardware registers as volatile). There are also 771 instances of const.
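For readers without grep to hand, the counting itself is simple. Below is a rough sketch in C of a whole-word keyword counter; `count_keyword` is a hypothetical helper, not a tool I actually use. Like a plain grep, it makes no attempt to skip comments or string literals.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* True if c can be part of a C identifier. */
static int is_ident_char(char c)
{
    return isalnum((unsigned char)c) || (c == '_');
}

/* Counts whole-word occurrences of 'keyword' in 'text'.
 * A match only counts if it is not embedded in a longer identifier,
 * so "const" does not match "const_cast" or "myconst". */
size_t count_keyword(const char *text, const char *keyword)
{
    const size_t klen = strlen(keyword);
    size_t count = 0u;
    const char *p = text;

    if (klen == 0u) {
        return 0u;
    }
    while ((p = strstr(p, keyword)) != NULL) {
        const int start_ok = (p == text) || !is_ident_char(p[-1]);
        const int end_ok = !is_ident_char(p[klen]);

        if (start_ok && end_ok) {
            count++;
        }
        p++; /* resume the search just past the start of this match */
    }
    return count;
}
```

Running something like this over each source file, and skipping the vendor headers, gives the same kind of tally as the figures quoted above.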
Regular readers of this blog will know I am a big fan of the ‘static‘ qualifier. Static not only makes for safer and more maintainable code, it also makes for faster code. In fact, IMHO the case for static is so overwhelming that I find its absence or infrequent use a strong indicator that the author of the code was an amateur. In my example code base, static appears 1484 times.
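Here is a small sketch of the style I am referring to. The module, its names and the 10-bit limit are all hypothetical; the point is simply that everything not needed by other files is declared static.

```c
#include <stdint.h>

/* Hypothetical full-scale value of a 10-bit ADC. */
#define SAMPLE_MAX 1023u

/* File-scope statics: invisible outside this translation unit, so no
 * other file can modify them or clash with their names at link time. */
static uint32_t s_sample_sum = 0u;
static uint32_t s_sample_count = 0u;

/* Static helper: since it cannot be called from another file, the
 * compiler is free to inline it and drop the stand-alone copy. */
static uint16_t clamp_sample(uint16_t sample)
{
    return (sample > SAMPLE_MAX) ? (uint16_t)SAMPLE_MAX : sample;
}

/* The public interface of the module is just these three functions. */
void filter_reset(void)
{
    s_sample_sum = 0u;
    s_sample_count = 0u;
}

void filter_add_sample(uint16_t sample)
{
    s_sample_sum += clamp_sample(sample);
    s_sample_count++;
}

uint16_t filter_average(void)
{
    if (s_sample_count == 0u) {
        return 0u; /* no samples yet */
    }
    return (uint16_t)(s_sample_sum / s_sample_count);
}
```

Whether the compiler actually inlines `clamp_sample` depends on the toolchain and optimization level, but static at least gives it the option.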
Regular readers of this blog also know that I am not a big fan of the case statement. While it has its place, too often I see it used as a substitute for thought. Indeed I have observed a strong inverse correlation between programmer skill and frequency of use of the case statement. As a result, I will usually run a grep to see what the case statement frequency is. In my example code, a case statement occurs 683 times, or once every 90 statements.
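One common alternative to a long case statement is a table of function pointers indexed by the value being switched on. The sketch below is illustrative only: the command codes and handlers are invented, and a table is not always the right replacement.

```c
#include <stddef.h>
#include <stdint.h>

/* Signature shared by all (hypothetical) command handlers. */
typedef int32_t (*command_handler_t)(int32_t arg);

static int32_t handle_double(int32_t arg)    { return arg * 2; }
static int32_t handle_negate(int32_t arg)    { return -arg; }
static int32_t handle_increment(int32_t arg) { return arg + 1; }

/* The command code is the index into this table, replacing one
 * 'case' label per command. */
static const command_handler_t k_handlers[] = {
    handle_double,    /* command 0 */
    handle_negate,    /* command 1 */
    handle_increment  /* command 2 */
};

#define NUM_HANDLERS (sizeof k_handlers / sizeof k_handlers[0])

/* Returns 0 on success, or -1 for an unknown command or a NULL result. */
int dispatch_command(uint8_t command, int32_t arg, int32_t *result)
{
    if ((command >= NUM_HANDLERS) || (result == NULL)) {
        return -1;
    }
    *result = k_handlers[command](arg);
    return 0;
}
```

Adding a command then means adding one handler and one table entry, and the range check lives in exactly one place.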
All of the above ‘tests’ can be performed without compiling the code. In some cases I own the target compiler (or can download an evaluation copy), in which case I will of course attempt to compile the code. When I do this I’m looking for several things:
- An absence of compiler warnings / errors. Alan Bowens has written concisely and eloquently on this topic. The bottom line – compilation warnings in the release build are a major issue for me. Note that I’m more forgiving of compiler warnings in the debug build, since by its nature a debug build often ignores things such as inline requests, which can generate warnings on some compilers.
- The compilation speed. Massive files containing very large functions compile very slowly. They are also a bear to maintain.
- The final image size. This is relevant both in absolute terms (8K versus 128K versus 2M) and also in comparison to the available memory. Small images using a small percentage of the available memory are much easier to maintain than large images that nearly fill the available memory.
The final test that I perform only rarely is to Lint the code base. I do this rarely because quite frankly it takes a long time to configure PC-Lint. Thus only if I have already created a PC-Lint configuration file for the target compiler do I perform this step. Previously un-linted code will always generate thousands of warnings. However, what I’m looking for are the really serious warnings – uninitialized variables, indexing beyond the end of an array, possible null pointer dereferences etc. If any of these are present then I know the code base is in bad shape.
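To show the three categories of serious warning in one place, here is a minimal sketch of a function written so that none of them applies. The function name and buffer length are hypothetical, and the comments describe the general class of problem rather than any specific PC-Lint message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed buffer length for this example. */
#define BUFFER_LEN 8u

/* Copies buffer[index] into *out.
 * Returns 0 on success, -1 on a NULL pointer or out-of-range index. */
int read_sample(const uint16_t *buffer, size_t index, uint16_t *out)
{
    uint16_t value = 0u; /* initialized at declaration: never read uninitialized */

    if ((buffer == NULL) || (out == NULL)) {
        return -1;       /* no possible null pointer dereference below */
    }
    if (index >= BUFFER_LEN) {
        return -1;       /* no indexing beyond the end of the array */
    }
    value = buffer[index];
    *out = value;
    return 0;
}
```

Remove any one of those three safeguards and you have the kind of defect I am looking for in the Lint output.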
I can typically run the above tests on a code base in an hour or so. At the end of it I usually have a good idea of the overall code quality and how difficult it will be to modify. I would be very interested to hear from readers who are willing to perform the same tests on their code base and to publish the results. (Incidentally, I’m not trying to claim that my metrics are necessarily good – they are intended merely as a reference / discussion point.)