<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
>

<channel>
	<title>Embedded Gurus - Experts on Embedded Software Posts</title>
	<atom:link href="http://embeddedgurus.com/wp-content/recent-global-posts-feed.php" rel="self" type="application/rss+xml" />
	<link>http://embeddedgurus.com</link>
	<description>Experts on Embedded Software</description>
	<pubDate>Tue, 13 Jul 2010 11:15:21 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
    			<item>
				<title>The Tragedy of Low-Flow Plumbing</title>
				<link>http://embeddedgurus.com/area-0x51/2010/07/the-tragedy-of-low-flow-plumbing/</link>
				<comments>http://embeddedgurus.com/area-0x51/2010/07/the-tragedy-of-low-flow-plumbing/#comments</comments>
				<pubDate>Tue, 13 Jul 2010 11:15:21 +0000</pubDate>
				<dc:creator>Mike Ficco</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/area-0x51/2010/07/the-tragedy-of-low-flow-plumbing/</guid>
                <description><![CDATA[I'm a middle-aged man, so you can guess what this blog is about…

Yes, that's right, bathroom fixtures!

Today we are examining the high-tech, highly political, and seriously flawed attempt at conservation by reducing the rate at which your plumbing can deliver water.

I lived in my house for over 20 years and the bathroom fixtures are showing their age.  In something resembling temporary insanity I bought a couple of new faucets.  They were chosen because they were attractive and reasonably priced.  I never even knew there was an issue with the water flow rate.  After all, we've been making water faucets since there has been indoor plumbing – why should they change now?

But change they have…

After installing my pretty new fixtures, I was disappointed to find water flows from them much more slowly than the originals.  My first thought was a manufacturing defect.  What's wrong with these things, I thought?  After some inquiry I learned I must actually thank my government for this.  Since 1994 federal regulations have placed upper limits on the water flow rates of bathroom faucets.  My research led me to the EPA website (www.epa.gov) where I was disappointed to find a classic case of government <em>Newspeak</em>.  On this website the low-flow faucets, which I initially thought were defective, are called "High Efficiency" faucets.  They may as well be called doubleplusgood faucets!

It seems "legal" bathroom faucets must conform to the EPA WaterSense® program specified American Society of Mechanical Engineers (ASME) A112.18.1 standard.  In brief this standard sets the maximum flow rate of 1.5 gallons per minute (gpm) for the common household water pressure of 60 pounds per square inch (psi).  This is, as Colonel Sherman Potter would say, "Horse Hockey" (if you are too young to understand this, look it up).

Don't get me wrong.  I'm as "green" as anybody.  I conscientiously recycle and turn water off when I'm not using it.  I absolutely agree it makes sense to re-engineer the classic design of the flushing toilet.  This makes sense with a toilet because the device autonomously uses the designed amount of water.  Faucets, however, do not.  They are – oh what's the word I'm looking for – ummm – oh yes!  Adjustable!  Legislating an upper flow rate of a trickle is not sanitation engineering, it's social engineering.  Even worse, it is misdirected social engineering.  This was done so I would use less water but that has not been the result.  Instead of a short quick blast of water to rinse my toothbrush, I have to let the water trickle onto the brush and rub my thumb back and forth.  Cleaning my twin-blade razor is even worse.  I can't blast it with water and rubbing my thumb on it doesn't seem like a good idea.

The flushing toilet was designed when the world population was about 1 billion.  It seems reasonable to revisit this design now that 5 billion more people are flushing.  The inefficiency of the WC has long been recognized.  Thrifty people back in the time of your grandparents would put a brick in a toilet's water basin so less water would be used.  However, and I'm back on my engineering quality horse again, when you ship a product it should actually work.  The first reduced water toilets were, to be frank, a piece of crap.  Many of them had to be flushed two and three times and, therefore, actually consumed more water when one considered real-world usage.

As I researched my "defective" faucets I found claims that these low-flow devices reduced water consumption by 30% or more.  Balderdash!  This isn't about the arithmetic of comparing the maximum flow of traditional and crippled faucets.  It's about social behavior.  I guarantee, based on my own behavior, water saving is minimal if any because the slow water jet is not strong enough to accomplish rapid rinsing.  Instead, the legislated trickle rinses and rinses and slowly does the job.

I applaud industry and government attempts to conserve our dwindling resources, but get it right and be realistic.  Don't create some imaginary world where there is some arithmetic relation between obstructed water flow and water savings.  MP3 became a spectacular success because it was designed around the real world of human perception.  Those who legislate the behavior of our bathroom fixtures would do well to get out of their ivory towers and consider actual human usage.

Let me say again, unlike the toilet, a faucet is adjustable – ADJUSTABLE!

Do not assume I use water with malicious intent.  These low-flow and certainly not high efficiency faucets were designed with the unwarranted assumption that the government knows better than me how to save water in my bathroom.

Bad idea.  Bad, bad idea.

If you want me to save water, tell me to save water.  If I don't save enough, penalize me in some fashion – perhaps by raising the price of water.  If water costs enough, I guarantee I can save more with personal effort than you can force me to save by crippling my faucet.

What were you guys thinking and how has this gone on for so many years?]]></description>
                <content:encoded><![CDATA[I'm a middle-aged man, so you can guess what this blog is about…

Yes, that's right, bathroom fixtures!

Today we are examining the high-tech, highly political, and seriously flawed attempt at conservation by reducing the rate at which your plumbing can deliver water.

I lived in my house for over 20 years and the bathroom fixtures are showing their age.  In something resembling temporary insanity I bought a couple of new faucets.  They were chosen because they were attractive and reasonably priced.  I never even knew there was an issue with the water flow rate.  After all, we've been making water faucets since there has been indoor plumbing – why should they change now?

But change they have…

After installing my pretty new fixtures, I was disappointed to find water flows from them much more slowly than the originals.  My first thought was a manufacturing defect.  What's wrong with these things, I thought?  After some inquiry I learned I must actually thank my government for this.  Since 1994 federal regulations have placed upper limits on the water flow rates of bathroom faucets.  My research led me to the EPA website (www.epa.gov) where I was disappointed to find a classic case of government <em>Newspeak</em>.  On this website the low-flow faucets, which I initially thought were defective, are called "High Efficiency" faucets.  They may as well be called doubleplusgood faucets!

It seems "legal" bathroom faucets must conform to the EPA WaterSense® program specified American Society of Mechanical Engineers (ASME) A112.18.1 standard.  In brief this standard sets the maximum flow rate of 1.5 gallons per minute (gpm) for the common household water pressure of 60 pounds per square inch (psi).  This is, as Colonel Sherman Potter would say, "Horse Hockey" (if you are too young to understand this, look it up).

Don't get me wrong.  I'm as "green" as anybody.  I conscientiously recycle and turn water off when I'm not using it.  I absolutely agree it makes sense to re-engineer the classic design of the flushing toilet.  This makes sense with a toilet because the device autonomously uses the designed amount of water.  Faucets, however, do not.  They are – oh what's the word I'm looking for – ummm – oh yes!  Adjustable!  Legislating an upper flow rate of a trickle is not sanitation engineering, it's social engineering.  Even worse, it is misdirected social engineering.  This was done so I would use less water but that has not been the result.  Instead of a short quick blast of water to rinse my toothbrush, I have to let the water trickle onto the brush and rub my thumb back and forth.  Cleaning my twin-blade razor is even worse.  I can't blast it with water and rubbing my thumb on it doesn't seem like a good idea.

The flushing toilet was designed when the world population was about 1 billion.  It seems reasonable to revisit this design now that 5 billion more people are flushing.  The inefficiency of the WC has long been recognized.  Thrifty people back in the time of your grandparents would put a brick in a toilet's water basin so less water would be used.  However, and I'm back on my engineering quality horse again, when you ship a product it should actually work.  The first reduced water toilets were, to be frank, a piece of crap.  Many of them had to be flushed two and three times and, therefore, actually consumed more water when one considered real-world usage.

As I researched my "defective" faucets I found claims that these low-flow devices reduced water consumption by 30% or more.  Balderdash!  This isn't about the arithmetic of comparing the maximum flow of traditional and crippled faucets.  It's about social behavior.  I guarantee, based on my own behavior, water saving is minimal if any because the slow water jet is not strong enough to accomplish rapid rinsing.  Instead, the legislated trickle rinses and rinses and slowly does the job.

I applaud industry and government attempts to conserve our dwindling resources, but get it right and be realistic.  Don't create some imaginary world where there is some arithmetic relation between obstructed water flow and water savings.  MP3 became a spectacular success because it was designed around the real world of human perception.  Those who legislate the behavior of our bathroom fixtures would do well to get out of their ivory towers and consider actual human usage.

Let me say again, unlike the toilet, a faucet is adjustable – ADJUSTABLE!

Do not assume I use water with malicious intent.  These low-flow and certainly not high efficiency faucets were designed with the unwarranted assumption that the government knows better than me how to save water in my bathroom.

Bad idea.  Bad, bad idea.

If you want me to save water, tell me to save water.  If I don't save enough, penalize me in some fashion – perhaps by raising the price of water.  If water costs enough, I guarantee I can save more with personal effort than you can force me to save by crippling my faucet.

What were you guys thinking and how has this gone on for so many years?]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/area-0x51/2010/07/the-tragedy-of-low-flow-plumbing/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Tools to help lower power consumption</title>
				<link>http://embeddedgurus.com/stack-overflow/2010/06/tools-to-help-lower-power-consumption/</link>
				<comments>http://embeddedgurus.com/stack-overflow/2010/06/tools-to-help-lower-power-consumption/#comments</comments>
				<pubDate>Wed, 30 Jun 2010 00:24:39 +0000</pubDate>
				<dc:creator>Nigel Jones</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/stack-overflow/2010/06/tools-to-help-lower-power-consumption/</guid>
                <description><![CDATA[Regular readers will know that low power designs are an interest of mine. Indeed one of the <a href="../2006/12/wanted-a-new-performance-metric/">very  first blog posts</a> I made lamented how difficult it is to ascertain  how much energy it takes to perform various tasks typical to an embedded  system. Thus it was a pleasant surprise to receive an IAR newsletter today announcing a <a href="http://www.iar.com/website1/1.0.1.0/2563/1/">tool</a> ('Power debugging') that is explicitly designed to help one lower a system's power consumption. The tool isn't available yet, but if the propaganda is to be believed it should be a very interesting adjunct to the debugging arsenal. The sign up procedure to beta test the tool doesn't seem to work properly, but on the assumption that I made it onto the beta tester list I will  post a review once I get my hands on it.

BTW I have to admit I found the name of the article / tool ('Power debugging') a bit confusing in the sense that I interpreted power in the vernacular sense (e.g. 'power walking', 'power breakfast') rather than the engineering sense. I guess I'm just a victim of so much marketing hyperbole that I can't recognize plain talk any more. Oh well!]]></description>
                <content:encoded><![CDATA[Regular readers will know that low power designs are an interest of mine. Indeed one of the <a href="../2006/12/wanted-a-new-performance-metric/">very  first blog posts</a> I made lamented how difficult it is to ascertain  how much energy it takes to perform various tasks typical to an embedded  system. Thus it was a pleasant surprise to receive an IAR newsletter today announcing a <a href="http://www.iar.com/website1/1.0.1.0/2563/1/">tool</a> ('Power debugging') that is explicitly designed to help one lower a system's power consumption. The tool isn't available yet, but if the propaganda is to be believed it should be a very interesting adjunct to the debugging arsenal. The sign up procedure to beta test the tool doesn't seem to work properly, but on the assumption that I made it onto the beta tester list I will  post a review once I get my hands on it.

BTW I have to admit I found the name of the article / tool ('Power debugging') a bit confusing in the sense that I interpreted power in the vernacular sense (e.g. 'power walking', 'power breakfast') rather than the engineering sense. I guess I'm just a victim of so much marketing hyperbole that I can't recognize plain talk any more. Oh well!]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/stack-overflow/2010/06/tools-to-help-lower-power-consumption/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Evaluating embedded code</title>
				<link>http://embeddedgurus.com/stack-overflow/2010/06/evaluating-embedded-code/</link>
				<comments>http://embeddedgurus.com/stack-overflow/2010/06/evaluating-embedded-code/#comments</comments>
				<pubDate>Sun, 20 Jun 2010 19:22:55 +0000</pubDate>
				<dc:creator>Nigel Jones</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/stack-overflow/2010/06/evaluating-embedded-code/</guid>
                <description><![CDATA[One of the interesting aspects of being an embedded systems consultant is that I get to look at a lot of code written by others. This can come about in a number of ways, but most commonly occurs when someone wants changes made to an existing code base and the original author(s) of the code are no longer available. When faced with a situation such as this, it is essential that I quickly get a sense for how maintainable the code is - and thus how difficult changes will be. As a result I have developed a few techniques to help me assess code maintainability which I thought I'd share with you.
<h2>SourceMonitor</h2>
After installing the code on my system, the first thing I do is run SourceMonitor over the code. <a href="http://www.campwoodsw.com/sourcemonitor.html">SourceMonitor</a> is a free utility that computes various metrics. The metrics and their values for a typical code base of mine are shown below.

Number of Files: 476

Lines of code: 139,013

Statements: 61,144

% branches: 6.3%

% comments: 41.7%

Functions: 2,509

Average statements / function: 11.8

Max Complexity: 158

Max depth: 9+

Average depth: 0.54

Average complexity: 2.38

Probably the only thing that needs explanation is 'complexity'. The author of SourceMonitor is not computing the <a href="http://en.wikipedia.org/wiki/Cyclomatic_complexity">McCabe</a> complexity index, but rather is computing complexity based upon Steve McConnell's <a href="http://www.amazon.com/Code-Complete-Practical-Handbook-Construction/dp/0735619670/ref=sr_1_1?ie=UTF8&amp;s=books&amp;qid=1277056922&amp;sr=8-1">methodology</a>. The details of the implementation aren't particularly important to me, as I'm more interested in comparative values.

While SourceMonitor helps give me the big picture, it is nowhere near enough - and this is where it gets interesting.
<h2>Optimization Level</h2>
The next thing I look at are the optimization levels being used. These can be very revealing. For example if a high level of optimization is being used for the debug build then it might be indicative that a non-optimized build either will not fit into the available memory, or possibly that the code doesn't run fast enough unless optimization is turned on. Either is indicative of a system that is probably going to be tough to maintain. Conversely if the release build doesn't use full optimization then I take this to mean that the code probably doesn't work when optimization is turned on. I have <a href="http://embeddedgurus.com/stack-overflow/2008/07/efficient-c-tips-2-using-the-optimizer/">written</a> about this in the past and consider this to be a major indicator of potential code quality problems.
<h2>C-V Qualifiers</h2>
Having looked at the optimization levels, I then perform a grep on the code base looking for the number of instances of '<em>volatile</em>' and '<em>const</em>'. If the number of instances of <em>volatile </em>is zero (and it often is) and the optimization level is turned way down, then it's almost certain that the author of the code didn't understand <em>volatile </em>and that the code is riddled with potential problems. Whenever this happens, I get a sinking feeling because if the author didn't understand <em>volatile</em>, then there is no chance that he had any appreciation for race conditions, priority inversion, non-atomic operations etc. In short, the author was a PC programmer.

The '<em>const</em>' count is less revelatory. If the author makes use of <em>const </em>then this is normally an indicator that they know their way around the compiler and understand the value of defensive programming. In short I take the use of <em>const</em> to be very encouraging. However, I can say that I have known some excellent embedded systems programmers who rarely used <em>const</em>, and thus its absence doesn't fill me with the same despair as the absence of <em>volatile</em>.

Incidentally in my code base described above, there are 53 instances of the use of 'volatile' (note that I have excluded compiler vendor supplied header files which define all the various hardware registers as volatile). There are also 771 instances of the use of <em>const</em>.
<h2>Static qualifiers</h2>
Regular readers of this blog will know I am a big fan of the '<em>static</em>' qualifier. Static not only makes for safer and more maintainable code, it also makes for <a href="http://embeddedgurus.com/stack-overflow/2008/12/efficient-c-tips-5-make-local-functions-static/">faster</a> code. In fact, IMHO the case for <em>static </em>is so overwhelming that I find its absence or infrequent use a strong indicator that the author of the code was an amateur. In my example code base, <em>static</em> appears 1484 times.
<h2>Case statements</h2>
Regular readers of this blog also know that I am not a big <a href="http://embeddedgurus.com/stack-overflow/2010/04/efficient-c-tip-12-be-wary-of-switch-statements/">fan</a> of the case statement. While it has its place, too often I see it used as a substitute for thought. Indeed I have observed a strong inverse correlation between programmer skill and frequency of use of the case statement. As a result, I will usually run a grep to see what the case statement frequency is. In my example code, a case statement occurs 683 times, or once every 90 statements.
<h2>Compilation</h2>
All of the above 'tests' can be performed without compiling the code. In some cases I own the target compiler (or can download an evaluation copy), in which case I will of course attempt to compile the code. When I do this I'm looking for several things:
<ol>
	<li>An absence of compiler warnings / errors. Alan Bowens has <a href="http://codereview.blogspot.com/2010/02/zero-tolerance.html">written</a> concisely and eloquently on this topic. The bottom line - compilation warnings in the release build are a major issue for me. Note that I'm more forgiving of compiler warnings in the debug build, since by its nature debug often ignores things such as inline commands, which can generate warnings on some compilers.</li>
	<li>The compilation speed. Massive files containing very large functions compile very slowly. They are also a bear to maintain.</li>
	<li>The final image size. This is relevant both in absolute terms (8K versus 128K versus 2M) and also in comparison to the available memory. Small images using a small percentage of the available memory are much easier to maintain than large images that nearly fill the available memory.</li>
</ol>
<h2>Lint</h2>
The final test that I perform only rarely is to <a href="http://www.rmbconsulting.us/Publications/use-Lint-For-Code-Analysis.pdf">Lint </a>the code base. I do this rarely because quite frankly it takes a long time to configure PC-Lint. Thus only if I have already created a PC-Lint configuration file for the target compiler do I perform this step. Previously un-linted code will always generate thousands of warnings. However, what I'm looking for are the really serious warnings - uninitialized variables, indexing beyond the end of an array, possible null pointer dereferences etc. If any of these are present then I know the code base is in bad shape.

I can typically run the above tests on a code base in an hour or so. At the end of it I usually have a great idea of the overall code quality and how difficult it will be to modify. I would be very interested to hear from readers that are willing to perform the same tests on their code base and to publish the results. (Incidentally, I'm not trying to claim that my metrics are necessarily good - they are intended merely as a reference / discussion point).]]></description>
                <content:encoded><![CDATA[One of the interesting aspects of being an embedded systems consultant is that I get to look at a lot of code written by others. This can come about in a number of ways, but most commonly occurs when someone wants changes made to an existing code base and the original author(s) of the code are no longer available. When faced with a situation such as this, it is essential that I quickly get a sense for how maintainable the code is - and thus how difficult changes will be. As a result I have developed a few techniques to help me assess code maintainability which I thought I'd share with you.
<h2>SourceMonitor</h2>
After installing the code on my system, the first thing I do is run SourceMonitor over the code. <a href="http://www.campwoodsw.com/sourcemonitor.html">SourceMonitor</a> is a free utility that computes various metrics. The metrics and their values for a typical code base of mine are shown below.

Number of Files: 476

Lines of code: 139,013

Statements: 61,144

% branches: 6.3%

% comments: 41.7%

Functions: 2,509

Average statements / function: 11.8

Max Complexity: 158

Max depth: 9+

Average depth: 0.54

Average complexity: 2.38

Probably the only thing that needs explanation is 'complexity'. The author of SourceMonitor is not computing the <a href="http://en.wikipedia.org/wiki/Cyclomatic_complexity">McCabe</a> complexity index, but rather is computing complexity based upon Steve McConnell's <a href="http://www.amazon.com/Code-Complete-Practical-Handbook-Construction/dp/0735619670/ref=sr_1_1?ie=UTF8&amp;s=books&amp;qid=1277056922&amp;sr=8-1">methodology</a>. The details of the implementation aren't particularly important to me, as I'm more interested in comparative values.

While SourceMonitor helps give me the big picture, it is nowhere near enough - and this is where it gets interesting.
<h2>Optimization Level</h2>
The next thing I look at are the optimization levels being used. These can be very revealing. For example if a high level of optimization is being used for the debug build then it might be indicative that a non-optimized build either will not fit into the available memory, or possibly that the code doesn't run fast enough unless optimization is turned on. Either is indicative of a system that is probably going to be tough to maintain. Conversely if the release build doesn't use full optimization then I take this to mean that the code probably doesn't work when optimization is turned on. I have <a href="http://embeddedgurus.com/stack-overflow/2008/07/efficient-c-tips-2-using-the-optimizer/">written</a> about this in the past and consider this to be a major indicator of potential code quality problems.
<h2>C-V Qualifiers</h2>
Having looked at the optimization levels, I then perform a grep on the code base looking for the number of instances of '<em>volatile</em>' and '<em>const</em>'. If the number of instances of <em>volatile </em>is zero (and it often is) and the optimization level is turned way down, then it's almost certain that the author of the code didn't understand <em>volatile </em>and that the code is riddled with potential problems. Whenever this happens, I get a sinking feeling because if the author didn't understand <em>volatile</em>, then there is no chance that he had any appreciation for race conditions, priority inversion, non-atomic operations etc. In short, the author was a PC programmer.

The '<em>const</em>' count is less revelatory. If the author makes use of <em>const </em>then this is normally an indicator that they know their way around the compiler and understand the value of defensive programming. In short I take the use of <em>const</em> to be very encouraging. However, I can say that I have known some excellent embedded systems programmers who rarely used <em>const</em>, and thus its absence doesn't fill me with the same despair as the absence of <em>volatile</em>.

Incidentally in my code base described above, there are 53 instances of the use of 'volatile' (note that I have excluded compiler vendor supplied header files which define all the various hardware registers as volatile). There are also 771 instances of the use of <em>const</em>.
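For anyone who has not been bitten by a missing <em>volatile</em>, here is a minimal sketch of the sort of code where it matters. Everything in it is hypothetical (rx_ready, rx_byte, uart_rx_isr, poll_rx and high_nibble_hex are names made up for illustration, not taken from any code base discussed here), and the 'ISR' is an ordinary function so that the fragment can be compiled on a PC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Shared between interrupt context and the main loop. Without 'volatile',
 * an optimizing compiler is free to cache these in registers. */
static volatile bool rx_ready = false;
static volatile uint8_t rx_byte = 0;

/* Read-only data: 'const' makes accidental writes a compile-time error and
 * lets many embedded toolchains place the table in flash. */
static const char hex_digits[] = "0123456789ABCDEF";

/* Hypothetical receive ISR. On a real target the interrupt controller would
 * call this; here it is an ordinary function. */
void uart_rx_isr(uint8_t received)
{
    rx_byte = received;
    rx_ready = true;
}

/* Returns true and stores the byte in *out if one has arrived.
 * Note: volatile does not make this read-and-clear atomic; on real hardware
 * the interrupt would need to be masked around it. */
bool poll_rx(uint8_t *out)
{
    assert(out != NULL);
    if (!rx_ready) {
        return false;
    }
    *out = rx_byte;
    rx_ready = false;
    return true;
}

/* Converts the upper four bits of a byte to its hex character. */
char high_nibble_hex(uint8_t byte)
{
    return hex_digits[byte >> 4];
}
```

If the two flags were not declared volatile, a build with the optimizer turned up could legitimately read rx_ready once and never look at it again, which is exactly the kind of code that only 'works' with optimization turned down.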
<h2>Static qualifiers</h2>
Regular readers of this blog will know I am a big fan of the '<em>static</em>' qualifier. Static not only makes for safer and more maintainable code, it also makes for <a href="http://embeddedgurus.com/stack-overflow/2008/12/efficient-c-tips-5-make-local-functions-static/">faster</a> code. In fact, IMHO the case for <em>static </em>is so overwhelming that I find its absence or infrequent use a strong indicator that the author of the code was an amateur. In my example code base, <em>static</em> appears 1484 times.
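As a rough illustration of what I look for, here is a small sketch of a module that uses <em>static</em> for both its private data and its private helper. The module, its function names and the 10 ms tick period are all assumptions made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define MS_PER_TICK 10u  /* assumed tick period, purely illustrative */

/* File-scope state with internal linkage: no other file can touch it. */
static uint32_t tick_count;

/* Private helper: 'static' keeps it out of the global namespace and makes
 * it an easy candidate for inlining. */
static uint32_t ticks_to_ms(uint32_t ticks)
{
    assert(ticks <= UINT32_MAX / MS_PER_TICK);  /* guard against overflow */
    return ticks * MS_PER_TICK;
}

/* The only two functions the rest of the system can see. */
void timer_tick(void)
{
    tick_count++;
}

uint32_t timer_elapsed_ms(void)
{
    return ticks_to_ms(tick_count);
}
```

Only timer_tick() and timer_elapsed_ms() are visible to the linker, so anyone maintaining the code knows that tick_count can only be changed from within this one file.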
<h2>Case statements</h2>
Regular readers of this blog also know that I am not a big <a href="http://embeddedgurus.com/stack-overflow/2010/04/efficient-c-tip-12-be-wary-of-switch-statements/">fan</a> of the case statement. While it has its place, too often I see it used as a substitute for thought. Indeed I have observed a strong inverse correlation between programmer skill and frequency of use of the case statement. As a result, I will usually run a grep to see what the case statement frequency is. In my example code, a case statement occurs 683 times, or once every 90 statements.
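One common alternative to a long case statement is a const lookup table indexed by the value being switched on. The sketch below shows the idea for a command dispatcher. It is only one possible approach under stated assumptions, not necessarily the right one for any given code base, and the command names and handlers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { CMD_STOP, CMD_START, CMD_RESET, CMD_COUNT } command_t;

static int motor_state;  /* hypothetical state changed by the handlers */

static void on_stop(void)  { motor_state = 0; }
static void on_start(void) { motor_state = 1; }
static void on_reset(void) { motor_state = -1; }

/* One entry per command; the table is const, so it can live in ROM. */
static void (* const handlers[CMD_COUNT])(void) = {
    [CMD_STOP]  = on_stop,
    [CMD_START] = on_start,
    [CMD_RESET] = on_reset,
};

/* Runs the handler for cmd. Returns 0 on success, -1 if cmd is out of range. */
int dispatch(int cmd)
{
    if (cmd < 0 || cmd >= CMD_COUNT) {
        return -1;
    }
    assert(handlers[cmd] != NULL);  /* every command must have a handler */
    handlers[cmd]();
    return 0;
}

int get_motor_state(void)
{
    return motor_state;
}
```

Adding a command then means adding one enum value and one table entry, and the range check takes the place of the default clause.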
<h2>Compilation</h2>
All of the above 'tests' can be performed without compiling the code. In some cases I own the target compiler (or can download an evaluation copy), in which case I will of course attempt to compile the code. When I do this I'm looking for several things:
<ol>
	<li>An absence of compiler warnings / errors. Alan Bowens has <a href="http://codereview.blogspot.com/2010/02/zero-tolerance.html">written</a> concisely and eloquently on this topic. The bottom line - compilation warnings in the release build are a major issue for me. Note that I'm more forgiving of compiler warnings in the debug build, since by its nature debug often ignores things such as inline commands, which can generate warnings on some compilers.</li>
	<li>The compilation speed. Massive files containing very large functions compile very slowly. They are also a bear to maintain.</li>
	<li>The final image size. This is relevant both in absolute terms (8K versus 128K versus 2M) and also in comparison to the available memory. Small images using a small percentage of the available memory are much easier to maintain than large images that nearly fill the available memory.</li>
</ol>
<h2>Lint</h2>
The final test that I perform only rarely is to <a href="http://www.rmbconsulting.us/Publications/use-Lint-For-Code-Analysis.pdf">Lint </a>the code base. I do this rarely because quite frankly it takes a long time to configure PC-Lint. Thus only if I have already created a PC-Lint configuration file for the target compiler do I perform this step. Previously un-linted code will always generate thousands of warnings. However, what I'm looking for are the really serious warnings - uninitialized variables, indexing beyond the end of an array, possible null pointer dereferences etc. If any of these are present then I know the code base is in bad shape.

I can typically run the above tests on a code base in an hour or so. At the end of it I usually have a great idea of the overall code quality and how difficult it will be to modify. I would be very interested to hear from readers that are willing to perform the same tests on their code base and to publish the results. (Incidentally, I'm not trying to claim that my metrics are necessarily good - they are intended merely as a reference / discussion point).]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/stack-overflow/2010/06/evaluating-embedded-code/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Lowering power consumption tip #4 - transmitting serial data</title>
				<link>http://embeddedgurus.com/stack-overflow/2010/05/lowering-power-consumption-tip-4-transmitting-serial-data/</link>
				<comments>http://embeddedgurus.com/stack-overflow/2010/05/lowering-power-consumption-tip-4-transmitting-serial-data/#comments</comments>
				<pubDate>Fri, 21 May 2010 01:57:01 +0000</pubDate>
				<dc:creator>Nigel Jones</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/stack-overflow/2010/05/lowering-power-consumption-tip-4-transmitting-serial-data/</guid>
                <description><![CDATA[This is the fourth in a <a href="http://embeddedgurus.com/stack-overflow/2009/07/lowering-power-consumption-tip-1-avoid-zeros-on-the-i2c-bus/">series</a> of tips on lowering power consumption in embedded systems. For this post I thought I'd delve into the common task of transmitting serial data. I compare polling and interrupting and show you how a hybrid approach can sometimes be optimal.

Almost every embedded system I have ever worked on has contained serial links. At its most abstract level, a serial link takes in parallel data and converts it to a serial stream. This serialization inherently takes longer than the write to the register that holds the data and thus to send multiple bytes back to back there is an inevitable delay. The process thus looks like this:
<pre>Store data to be transmitted
Wait for data to be sent out
Store data to be transmitted
Wait for data to be sent out
...
Store data to be transmitted
Wait for data to be sent out
</pre>
From a power consumption perspective, the question is - how best to wait for the data to be sent out? Well, you have four basic approaches - open loop, polling, interrupting or a hybrid combination.  In assessing them from a power consumption perspective, what I look at is how many non-useful clock cycles I have to execute in order to transmit a byte of data.
<h2>Open Loop</h2>
I use the term open loop to describe a technique whereby you make use of the properties of a synchronous link to know (actually more accurately presume) that it is safe to send the next byte. This technique is only of use when the transmit frequency is very high in comparison to the CPU speed. For example, consider an SPI link  between a CPU and a peripheral. In many cases, this link may be clocked  at up to half the CPU clock frequency. In which case it takes a mere 16  CPU clocks to shift out an 8 bit datum. As a result one can simply delay 16 clock cycles between writing successive bytes. The code looks something like this:
<pre>SBUF = datum[0];
delay(16 - LOAD_TIME);
SBUF = datum[1];
delay(16 - LOAD_TIME);
...
</pre>
LOAD_TIME is a constant that takes into account the number of cycles required to get the next datum from memory and write it to SBUF. Thus the number of non-useful clock cycles per byte is (16 - LOAD_TIME).

Now most of you are probably thinking that I'm nuts for advocating this approach - and I'd tend to agree with you! It's a technique I've only used a few times - and then only when I had to get the data out with the least possible latency and with the least amount of power consumed. I much prefer the next technique which can be almost as efficient - but a lot safer.
<h2>Polling</h2>
Polling differs from the open loop approach in that one polls a status register to determine when it is safe to write the next byte. This can be quite power efficient as long as, just as in the previous example, the transmit speed is very high in comparison to the CPU speed. Thus the SPI link given in the open loop example is also a good candidate for this approach. The code looks something like this:
<pre>SBUF = datum[0];
wait_for_sbuf_empty();
SBUF = datum[1];
wait_for_sbuf_empty();
...
</pre>
The key to making this approach as efficient as possible is to code the wait function so that you read the status register on the first clock after you expect SBUF will become available.  In other words you still use a pre-calculated delay, but you throw in a check of the status register just to make sure before you load the next byte. By pre-fetching the next byte to be loaded and doing some other tweaking it's often possible to get this approach almost as efficient as the open loop method. Notwithstanding these optimizations, the number of non-useful polling clock cycles will be greater than the number of CPU clocks required to transmit the data.
<h2>Interrupting</h2>
When the transmit frequency starts to slow down with respect to the CPU frequency, then the number of non-useful clock cycles quickly starts to rise if one uses a polling method. The classic example of this is of course asynchronous serial links running at standard baud rates.  In these cases, the transmit time is a large fraction of a millisecond and a polling approach consumes a huge number of CPU cycles (and hence power). The solution here is of course to turn to an interrupt driven approach. In this case the overhead of the ISR is 'non-useful' clock cycles.  As I showed in this <a href="http://www.rmbconsulting.us/minimize-interrupt-service-routine-overhead">article</a> the overhead of even a simple looking ISR can be quite significant. Notwithstanding this, for asynchronous serial links, an interrupt based approach is nearly always the most efficient.
<h2>Hybrid</h2>
The final methodology is what I term the hybrid approach. It's typically the most power efficient and is well suited to medium to fast serial links. The code for it looks like this:
<pre>SBUF = datum[0];
__sleep();
SBUF = datum[1];
__sleep();
...

__interrupt void sbuf_tx_isr(void)
{
 /* Empty */
}
</pre>
In this approach, I enable the transmit interrupt, but have no code in the interrupt handler. After each write to SBUF I execute a sleep instruction, effectively stopping op code processing. Once SBUF has emptied, it generates an interrupt. The processor vectors to the empty ISR, returns immediately and then processes the next instruction which stores the next byte in SBUF. In this case the overhead is the number of clock cycles to enter and exit sleep mode, plus the number of cycles to vector to an ISR and return. Depending upon your processor architecture this can be anything from almost nothing to quite a lot. However it is always less than a full blown interrupt handler approach and is, in my experience, often less than the polling or open loop methods.

Notwithstanding the above, this method has a couple of weaknesses:
<ol>
	<li>It should be obvious that the only interrupt that can be enabled is the SBUF transmit interrupt. (Actually it's more accurate to say that the only interrupt that can cause the processor to exit sleep mode is the SBUF transmit interrupt. The MSP430, for example, allows one to do this).</li>
	<li>While I don't consider this a kludge, it's certainly not crystal clear what is going on. Thus clear documentation is a must.</li>
</ol>
<h2>Summary</h2>
<ol>
	<li>If you feel the need for the utmost efficiency then go open loop. It's a bit like drag-racing in that it's fast, furious and undoubtedly gets you from A to B ASAP. Just don't be surprised if you blow up in the process.</li>
	<li>If open-loop isn't for you then polling may make sense provided you can crank up the transmit speed high enough. This makes for the simplest code - and that's always a plus in my book.</li>
	<li>If you have an asynchronous link, then an interrupt based approach is the right answer 99% of the time.</li>
	<li>If you have a medium to high speed link, then the hybrid approach has much to commend it. Once you've seen it done a few times it becomes less weird looking.</li>
</ol>
<a href="http://embeddedgurus.com/stack-overflow/2009/11/lowering-power-consumption-tip-3-using-relays/">Previous Tip</a>]]></description>
                <content:encoded><![CDATA[This is the fourth in a <a href="http://embeddedgurus.com/stack-overflow/2009/07/lowering-power-consumption-tip-1-avoid-zeros-on-the-i2c-bus/">series</a> of tips on lowering power consumption in embedded systems. For this post I thought I'd delve into the common task of transmitting serial data. I compare polling and interrupting and show you how a hybrid approach can sometimes be optimal.

Almost every embedded system I have ever worked on has contained serial links. At its most abstract level, a serial link takes in parallel data and converts it to a serial stream. This serialization inherently takes longer than the write to the register that holds the data and thus to send multiple bytes back to back there is an inevitable delay. The process thus looks like this:
<pre>Store data to be transmitted
Wait for data to be sent out
Store data to be transmitted
Wait for data to be sent out
...
Store data to be transmitted
Wait for data to be sent out
</pre>
From a power consumption perspective, the question is - how best to wait for the data to be sent out? Well, you have four basic approaches - open loop, polling, interrupting or a hybrid combination.  In assessing them from a power consumption perspective, what I look at is how many non-useful clock cycles I have to execute in order to transmit a byte of data.
<h2>Open Loop</h2>
I use the term open loop to describe a technique whereby you make use of the properties of a synchronous link to know (actually more accurately presume) that it is safe to send the next byte. This technique is only of use when the transmit frequency is very high in comparison to the CPU speed. For example, consider an SPI link  between a CPU and a peripheral. In many cases, this link may be clocked  at up to half the CPU clock frequency. In which case it takes a mere 16  CPU clocks to shift out an 8 bit datum. As a result one can simply delay 16 clock cycles between writing successive bytes. The code looks something like this:
<pre>SBUF = datum[0];
delay(16 - LOAD_TIME);
SBUF = datum[1];
delay(16 - LOAD_TIME);
...
</pre>
LOAD_TIME is a constant that takes into account the number of cycles required to get the next datum from memory and write it to SBUF. Thus the number of non-useful clock cycles per byte is (16 - LOAD_TIME).
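
For what it's worth, the arithmetic behind the 16 clock figure and the resulting delay can be sketched in a few lines of C. This is only an illustration: the function names are hypothetical, and it assumes the CPU clock is an exact integer multiple of the SPI clock.

```c
#include <stdint.h>

/* Hypothetical helper: CPU clocks needed to shift out one datum.
 * Assumes cpu_hz is an exact integer multiple of spi_hz. */
static uint32_t cpu_clocks_per_datum(uint32_t cpu_hz, uint32_t spi_hz, uint32_t bits)
{
    return bits * (cpu_hz / spi_hz);
}

/* Hypothetical helper: non-useful cycles to burn per byte in the open loop
 * scheme, i.e. shift time minus the time spent loading the next datum.
 * Returns 0 if loading already takes at least as long as shifting. */
static uint32_t open_loop_delay(uint32_t shift_clocks, uint32_t load_time)
{
    return (shift_clocks > load_time) ? (shift_clocks - load_time) : 0u;
}
```

With a 16 MHz CPU and an 8 MHz SPI clock, cpu_clocks_per_datum() gives 16 for an 8 bit datum. If the load takes, say, 6 cycles, the delay to insert is 10 cycles.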

Now most of you are probably thinking that I'm nuts for advocating this approach - and I'd tend to agree with you! It's a technique I've only used a few times - and then only when I had to get the data out with the least possible latency and with the least amount of power consumed. I much prefer the next technique which can be almost as efficient - but a lot safer.
<h2>Polling</h2>
Polling differs from the open loop approach in that one polls a status register to determine when it is safe to write the next byte. This can be quite power efficient as long as, just as in the previous example, the transmit speed is very high in comparison to the CPU speed. Thus the SPI link given in the open loop example is also a good candidate for this approach. The code looks something like this:
<pre>SBUF = datum[0];
wait_for_sbuf_empty();
SBUF = datum[1];
wait_for_sbuf_empty();
...
</pre>
The key to making this approach as efficient as possible is to code the wait function so that you read the status register on the first clock after you expect SBUF will become available.  In other words you still use a pre-calculated delay, but you throw in a check of the status register just to make sure before you load the next byte. By pre-fetching the next byte to be loaded and doing some other tweaking it's often possible to get this approach almost as efficient as the open loop method. Notwithstanding these optimizations, the number of non-useful polling clock cycles will be greater than the number of CPU clocks required to transmit the data.
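
To make the "delay first, then check" idea concrete, here is a minimal host-side sketch. It is not real driver code: the transmitter is simulated with a tick counter, every name in it is hypothetical, and it assumes a byte takes 16 clocks to shift out and that each status read costs one clock.

```c
#include <stdint.h>
#include <stdbool.h>

#define SHIFT_CLOCKS 16u          /* Assumed clocks to shift out one byte */

static uint32_t sim_clock;        /* Simulated CPU clock counter */
static uint32_t sim_tx_done_at;   /* Clock value at which SBUF empties */
static uint32_t status_reads;     /* Count of status register reads */

/* Stand-in for the write to SBUF: starts a simulated transmission. */
static void sbuf_write(uint8_t datum)
{
    (void)datum;
    sim_tx_done_at = sim_clock + SHIFT_CLOCKS;
}

/* Stand-in for a calibrated delay of n clocks. */
static void delay_clocks(uint32_t n)
{
    sim_clock += n;
}

/* Stand-in for reading the status register; each read costs one clock. */
static bool sbuf_empty(void)
{
    bool empty = (sim_clock >= sim_tx_done_at);
    status_reads++;
    sim_clock++;
    return empty;
}

/* Wait the pre-calculated time, then confirm with the status register. */
static void wait_for_sbuf_empty(uint32_t expected_wait)
{
    delay_clocks(expected_wait);
    while (!sbuf_empty()) {
        /* Only reached if the pre-calculated delay was too short */
    }
}

/* Sends len bytes and returns how many status reads were needed. */
static uint32_t transmit(const uint8_t *data, uint32_t len, uint32_t expected_wait)
{
    status_reads = 0u;
    for (uint32_t i = 0u; i < len; i++) {
        sbuf_write(data[i]);
        wait_for_sbuf_empty(expected_wait);
    }
    return status_reads;
}
```

In this model, sending four bytes with expected_wait set to 16 costs exactly one status read per byte, whereas polling from the outset (expected_wait of 0) costs 17 reads per byte.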
<h2>Interrupting</h2>
When the transmit frequency starts to slow down with respect to the CPU frequency, then the number of non-useful clock cycles quickly starts to rise if one uses a polling method. The classic example of this is of course asynchronous serial links running at standard baud rates.  In these cases, the transmit time is a large fraction of a millisecond and a polling approach consumes a huge number of CPU cycles (and hence power). The solution here is of course to turn to an interrupt driven approach. In this case the overhead of the ISR is 'non-useful' clock cycles.  As I showed in this <a href="http://www.rmbconsulting.us/minimize-interrupt-service-routine-overhead">article</a> the overhead of even a simple looking ISR can be quite significant. Notwithstanding this, for asynchronous serial links, an interrupt based approach is nearly always the most efficient.
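
For completeness, an interrupt driven transmitter usually looks something like the sketch below. It is one common arrangement (a small ring buffer drained by the transmit ISR) rather than anything specific to a given part. SBUF and the interrupt enable flag are plain variables standing in for hardware registers, and all names and the buffer size are assumptions.

```c
#include <stdint.h>
#include <stdbool.h>

#define TX_BUF_SIZE 16u                 /* Assumed size; must be a power of two */

static volatile uint8_t tx_buf[TX_BUF_SIZE];
static volatile uint8_t tx_head;        /* Next free slot, written by the application */
static volatile uint8_t tx_tail;        /* Next byte to send, written by the ISR */
static volatile uint8_t SBUF;           /* Stand-in for the transmit register */
static volatile bool tx_irq_enabled;    /* Stand-in for the interrupt enable bit */

/* Queue one byte for transmission. Returns false if the buffer is full. */
bool tx_put(uint8_t b)
{
    uint8_t next = (uint8_t)((tx_head + 1u) & (TX_BUF_SIZE - 1u));
    if (next == tx_tail) {
        return false;
    }
    tx_buf[tx_head] = b;
    tx_head = next;
    tx_irq_enabled = true;              /* Make sure the ISR will drain the buffer */
    return true;
}

/* Transmit ISR: load the next byte, or disable the interrupt when idle. */
void sbuf_tx_isr(void)
{
    if (tx_tail == tx_head) {
        tx_irq_enabled = false;         /* Nothing left to send */
        return;
    }
    SBUF = tx_buf[tx_tail];
    tx_tail = (uint8_t)((tx_tail + 1u) & (TX_BUF_SIZE - 1u));
}
```

On real hardware the ISR would be declared with your compiler's interrupt keyword and invoked by the peripheral. Everything the ISR does beyond the single store to SBUF is the 'non-useful' overhead referred to above.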
<h2>Hybrid</h2>
The final methodology is what I term the hybrid approach. It's typically the most power efficient and is well suited to medium to fast serial links. The code for it looks like this:
<pre>SBUF = datum[0];
__sleep();
SBUF = datum[1];
__sleep();
...

__interrupt void sbuf_tx_isr(void)
{
 /* Empty */
}
</pre>
In this approach, I enable the transmit interrupt, but have no code in the interrupt handler. After each write to SBUF I execute a sleep instruction, effectively stopping op code processing. Once SBUF has emptied, it generates an interrupt. The processor vectors to the empty ISR, returns immediately and then processes the next instruction which stores the next byte in SBUF. In this case the overhead is the number of clock cycles to enter and exit sleep mode, plus the number of cycles to vector to an ISR and return. Depending upon your processor architecture this can be anything from almost nothing to quite a lot. However it is always less than a full blown interrupt handler approach and is, in my experience, often less than the polling or open loop methods.

Notwithstanding the above, this method has a couple of weaknesses:
<ol>
	<li>It should be obvious that the only interrupt that can be enabled is the SBUF transmit interrupt. (Actually it's more accurate to say that the only interrupt that can cause the processor to exit sleep mode is the SBUF transmit interrupt. The MSP430, for example, allows one to do this).</li>
	<li>While I don't consider this a kludge, it's certainly not crystal clear what is going on. Thus clear documentation is a must.</li>
</ol>
<h2>Summary</h2>
<ol>
	<li>If you feel the need for the utmost efficiency then go open loop. It's a bit like drag-racing in that it's fast, furious and undoubtedly gets you from A to B ASAP. Just don't be surprised if you blow up in the process.</li>
	<li>If open-loop isn't for you then polling may make sense provided you can crank up the transmit speed high enough. This makes for the simplest code - and that's always a plus in my book.</li>
	<li>If you have an asynchronous link, then an interrupt based approach is the right answer 99% of the time.</li>
	<li>If you have a medium to high speed link, then the hybrid approach has much to commend it. Once you've seen it done a few times it becomes less weird looking.</li>
</ol>
<a href="http://embeddedgurus.com/stack-overflow/2009/11/lowering-power-consumption-tip-3-using-relays/">Previous Tip</a>]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/stack-overflow/2010/05/lowering-power-consumption-tip-4-transmitting-serial-data/feed/</wfw:commentRss>
			</item>
						<item>
				<title>It's Good To Be An Engineer - - Sort Of</title>
				<link>http://embeddedgurus.com/area-0x51/2010/05/its-good-to-be-an-engineer-sort-of/</link>
				<comments>http://embeddedgurus.com/area-0x51/2010/05/its-good-to-be-an-engineer-sort-of/#comments</comments>
				<pubDate>Fri, 14 May 2010 10:50:53 +0000</pubDate>
				<dc:creator>Mike Ficco</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/area-0x51/2010/05/its-good-to-be-an-engineer-sort-of/</guid>
                <description><![CDATA[About 14 months ago I bought a name brand laptop from a well-known retailer.  I declined the extended warranty so I should not have been surprised when the power jack inside the laptop cracked 63 days after the warranty expired.  The damage was disabling as the external plug had to be propped at a severe angle to charge the computer.  Unfortunately it would have to be repaired.

A quick call to the retailer told me I was in trouble.  They said the repair would require replacing the motherboard and that could only be done by the factory service center on the other side of the country.  Undaunted, I called the service center – more bad news.  The process was to send them the laptop and a $99.95 evaluation fee plus $20 shipping.  In a little while they would get back to me and tell me the actual cost of the repair.

Forget them!  I'm an engineer!  I can fix this thing myself!

After a couple of call transfers, pressing 1, 3, 1, 1, #, 3 – getting disconnected – calling more numbers and pressing more keys I was able to order the power connector for only $37 + $8 shipping.  I declined the $24 overnight delivery option, but the connector arrived the next day anyway.

I excitedly opened the package but was horrified to see the connector had an attached Frankenstein of a cable.  It was about 12 inches long with two right angle bends.  I said to myself, "This can't be good".

More than 30 screws later my table was covered with a mostly disassembled laptop.  The connector was not mounted on the motherboard but was attached to the frame.  The motherboard was still partially attached but I was able to pry it up enough to remove the old Frankenstein cable and route the new one.  I congratulated myself for being an engineer and began reassembling the computer.  A little while later I plugged in the charger.  As the battery charging light came on I thought, "That wasn't too bad".

But wait!  The computer would not turn on when I pressed the power button.  I disassembled the computer again but saw nothing wrong.  The power button worked fine, the battery was being charged, but no power was reaching the motherboard.  I reassembled the laptop hoping against hope that there was some interlock that required full assembly before power-up was allowed.  No such luck.  I walked away from the laptop to watch some late-night TV, but the problem wouldn't leave my head.  Did I crack the motherboard while lifting it to route the power cable?  Did I somehow damage an interlock?  Was something being shorted?

By the next morning I decided the most likely problem was that I inadvertently broke or disconnected a cable hidden under the motherboard.  The following weekend I again disassembled the laptop – this time completely.  There, underneath the motherboard, was a ribbon cable that had been pulled from its socket.  It was quite short and I had a bear of a time reconnecting it.  Eventually I got everything back together with some fresh thermal paste on the heat sink and fresh Loctite on the screws.  The laptop was finally back together and working.

So, what did I learn from this adventure?  One thing is that being an engineer makes you do stupid things.  No normal person would ever think about disassembling a laptop computer.  After this adventure I've become as smart as a normal person.  Working on this was such a pain in the butt that next time I'll just throw the laptop away.

The other thing I learned was that cheap assembly labor makes for really, really bad designs.  You have got to be kidding me.  This was very much a name brand computer but the inside was a plumber's nightmare.  Two more weeks with the CAD group would have reduced by half the 70 to 100+ screws that hold everything together.  Another week of CAD improvements and half of the 20 or 30 connectors would be gone.

I think one unexpected benefit of foreign assembly of products is that the design engineers don't understand the language.  They don't understand the cruel jokes the assemblers make about their convoluted mess of a product.

Look ashamed boys.  Look ashamed.  I've seen inside your product and you have every reason to be very ashamed.]]></description>
                <content:encoded><![CDATA[About 14 months ago I bought a name brand laptop from a well-known retailer.  I declined the extended warranty so I should not have been surprised when the power jack inside the laptop cracked 63 days after the warranty expired.  The damage was disabling as the external plug had to be propped at a severe angle to charge the computer.  Unfortunately it would have to be repaired.

A quick call to the retailer told me I was in trouble.  They said the repair would require replacing the motherboard and that could only be done by the factory service center on the other side of the country.  Undaunted, I called the service center – more bad news.  The process was to send them the laptop and a $99.95 evaluation fee plus $20 shipping.  In a little while they would get back to me and tell me the actual cost of the repair.

Forget them!  I'm an engineer!  I can fix this thing myself!

After a couple of call transfers, pressing 1, 3, 1, 1, #, 3 – getting disconnected – calling more numbers and pressing more keys I was able to order the power connector for only $37 + $8 shipping.  I declined the $24 overnight delivery option, but the connector arrived the next day anyway.

I excitedly opened the package but was horrified to see the connector had an attached Frankenstein of a cable.  It was about 12 inches long with two right angle bends.  I said to myself, "This can't be good".

More than 30 screws later my table was covered with a mostly disassembled laptop.  The connector was not mounted on the motherboard but was attached to the frame.  The motherboard was still partially attached but I was able to pry it up enough to remove the old Frankenstein cable and route the new one.  I congratulated myself for being an engineer and began reassembling the computer.  A little while later I plugged in the charger.  As the battery charging light came on I thought, "That wasn't too bad".

But wait!  The computer would not turn on when I pressed the power button.  I disassembled the computer again but saw nothing wrong.  The power button worked fine, the battery was being charged, but no power was reaching the motherboard.  I reassembled the laptop hoping against hope that there was some interlock that required full assembly before power-up was allowed.  No such luck.  I walked away from the laptop to watch some late-night TV, but the problem wouldn't leave my head.  Did I crack the motherboard while lifting it to route the power cable?  Did I somehow damage an interlock?  Was something being shorted?

By the next morning I decided the most likely problem was that I inadvertently broke or disconnected a cable hidden under the motherboard.  The following weekend I again disassembled the laptop – this time completely.  There, underneath the motherboard, was a ribbon cable that had been pulled from its socket.  It was quite short and I had a bear of a time reconnecting it.  Eventually I got everything back together with some fresh thermal paste on the heat sink and fresh Loctite on the screws.  The laptop was finally back together and working.

So, what did I learn from this adventure?  One thing is that being an engineer makes you do stupid things.  No normal person would ever think about disassembling a laptop computer.  After this adventure I've become as smart as a normal person.  Working on this was such a pain in the butt that next time I'll just throw the laptop away.

The other thing I learned was that cheap assembly labor makes for really, really bad designs.  You have got to be kidding me.  This was very much a name brand computer but the inside was a plumber's nightmare.  Two more weeks with the CAD group would have reduced by half the 70 to 100+ screws that hold everything together.  Another week of CAD improvements and half of the 20 or 30 connectors would be gone.

I think one unexpected benefit of foreign assembly of products is that the design engineers don't understand the language.  They don't understand the cruel jokes the assemblers make about their convoluted mess of a product.

Look ashamed boys.  Look ashamed.  I've seen inside your product and you have every reason to be very ashamed.]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/area-0x51/2010/05/its-good-to-be-an-engineer-sort-of/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Balancing How Firmware Waits on Hardware</title>
				<link>http://embeddedgurus.com/embedded-bridge/2010/05/balancing-how-firmware-waits-on-hardware/</link>
				<comments>http://embeddedgurus.com/embedded-bridge/2010/05/balancing-how-firmware-waits-on-hardware/#comments</comments>
				<pubDate>Sat, 08 May 2010 04:39:50 +0000</pubDate>
				<dc:creator>Gary Stringham</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/embedded-bridge/2010/05/balancing-how-firmware-waits-on-hardware/</guid>
                <description><![CDATA[A common question engineers often wrestle with is how long hardware will take to do a requested task so firmware can take the next step. Engineers implement different designs (both in hardware and firmware) depending on the length of time, and these designs have varying impacts on hardware and firmware complexity and overall system performance. Understanding their ramifications during the design phase helps balance the load between hardware and firmware.

Based on the hardware and firmware implementation required, we can group these designs into three categories:
<ul>
	<li><strong>No Delay</strong> – Hardware completes the task almost immediately. Firmware can assume the task is immediately completed and can safely take the next step.</li>
	<li><strong>Short Delay</strong> – Hardware completes the task after a short delay. Firmware must wait momentarily for the task to complete before taking the next step.</li>
	<li><strong>Long Delay</strong> – Hardware completes the task after a long delay. The wait time is long enough that firmware should do other processing while waiting for the task to complete before it can take the next step.</li>
</ul>
Let’s take aborts in hardware as an example, since implementations exist in each of the three categories – no, short, and long delays. For some aborts there is no delay; it is a simple matter of returning back to the home or idle state, clearing counters and buffers, and completing other activities that can be done quickly. Such an operation is so quick that it is not necessary for hardware to add extra logic for a status or interrupt bit. In these cases, firmware can initiate the abort and simply move on to the next step, which may be to set up the hardware for the next job. The key is for hardware to complete the abort before firmware tries to access it again.
<blockquote><strong>Best Practice:</strong> When the task in hardware is fast enough to complete before the next firmware access, hardware does not need to implement a status or interrupt bit for task completion.</blockquote>
Some abort implementations can take several clock cycles to complete, which means that firmware must wait for completion before accessing the block again. If it is a short delay, hardware should provide a status bit that firmware could poll, looping a few times until the task is done, then move on to the next step. If there is a long delay, then hardware should provide an interrupt bit that firmware will enable. Firmware will then do other processing while waiting for the interrupt to occur. Setting up, waiting and responding to an interrupt requires several CPU cycles with task swaps, context switches and semaphore handling. Thus, for firmware, polling a status bit is preferable to managing an interrupt if the task will be done after a short delay.

Where that line should be between short and long delays must be determined on a case-by-case basis and depends on the hardware platform, operating system and performance requirements. The dividing line could even move dynamically depending on the current operating conditions of the product. To give engineers the flexibility of moving that dividing line, the hardware for short and long delays should be the same, implemented with both a status bit and a maskable interrupt. This flexibility allows engineers to calculate or take measurements to count how many loops the polling is taking and determine if polling is acceptable or if interrupts are needed.
<blockquote><strong>Best Practice:</strong> Implement both a status bit and a maskable interrupt bit to indicate completion of hardware tasks that take time to complete, whether a short or a long time.</blockquote>
For some blocks, the time the abort takes can vary from a short delay if the block is in an idle state to a long delay if the block is busy and needs to gracefully terminate. Since firmware cannot know the current state, it must always assume the worst case. If firmware wants to take advantage of the shorter aborts when they do occur, it could poll for several loops in case the task completes quickly. If the task has not completed by then, firmware enables the interrupt and switches to another task.

To help engineers know how to implement the firmware, put in the block’s documentation the min and max abort times and the conditions in which they will occur. It could be something such as, “if the block is already idle, the abort will complete in 20ns, otherwise it will take 2-3us to complete.”
<blockquote><strong>Best Practice:</strong> Document the min and max times that a hardware task will take, including the conditions and states that affect those times.</blockquote>
I used aborts for these examples, but the concepts apply for any firmware-initiated hardware task that could take time to complete. Implementing both status and interrupt bits for short- and long-delay hardware tasks allows firmware to balance the system load and performance by using polling loops or interrupts as appropriate.

Until the next interrupt (which will not occur for at least 1,000,000,000,000us)...]]></description>
                <content:encoded><![CDATA[A common question engineers often wrestle with is how long hardware will take to do a requested task so firmware can take the next step. Engineers implement different designs (both in hardware and firmware) depending on the length of time, and these designs have varying impacts on hardware and firmware complexity and overall system performance. Understanding their ramifications during the design phase helps balance the load between hardware and firmware.

Based on the hardware and firmware implementation required, we can group these designs into three categories:
<ul>
	<li><strong>No Delay</strong> – Hardware completes the task almost immediately. Firmware can assume the task is immediately completed and can safely take the next step.</li>
	<li><strong>Short Delay</strong> – Hardware completes the task after a short delay. Firmware must wait momentarily for the task to complete before taking the next step.</li>
	<li><strong>Long Delay</strong> – Hardware completes the task after a long delay. The wait time is long enough that firmware should do other processing while waiting for the task to complete before it can take the next step.</li>
</ul>
Let’s take aborts in hardware as an example, since implementations exist in each of the three categories – no, short, and long delays. For some aborts there is no delay; it is a simple matter of returning back to the home or idle state, clearing counters and buffers, and completing other activities that can be done quickly. Such an operation is so quick that it is not necessary for hardware to add extra logic for a status or interrupt bit. In these cases, firmware can initiate the abort and simply move on to the next step, which may be to set up the hardware for the next job. The key is for hardware to complete the abort before firmware tries to access it again.
<blockquote><strong>Best Practice:</strong> When the task in hardware is fast enough to complete before the next firmware access, hardware does not need to implement a status or interrupt bit for task completion.</blockquote>
Some abort implementations can take several clock cycles to complete, which means that firmware must wait for completion before accessing the block again. If it is a short delay, hardware should provide a status bit that firmware could poll, looping a few times until the task is done, then move on to the next step. If there is a long delay, then hardware should provide an interrupt bit that firmware will enable. Firmware will then do other processing while waiting for the interrupt to occur. Setting up, waiting and responding to an interrupt requires several CPU cycles with task swaps, context switches and semaphore handling. Thus, for firmware, polling a status bit is preferable to managing an interrupt if the task will be done after a short delay.

Where that line should be between short and long delays must be determined on a case-by-case basis and depends on the hardware platform, operating system and performance requirements. The dividing line could even move dynamically depending on the current operating conditions of the product. To give engineers the flexibility of moving that dividing line, the hardware for short and long delays should be the same, implemented with both a status bit and a maskable interrupt. This flexibility allows engineers to calculate or take measurements to count how many loops the polling is taking and determine if polling is acceptable or if interrupts are needed.
<blockquote><strong>Best Practice:</strong> Implement both a status bit and a maskable interrupt bit to indicate completion of hardware tasks that take time to complete, whether a short or a long time.</blockquote>
For some blocks, the time the abort takes can vary from a short delay if the block is in an idle state to a long delay if the block is busy and needs to terminate gracefully. Since firmware cannot know the current state, it must always assume the worst case. If firmware wants to take advantage of the shorter aborts when they do occur, it could poll for several loops in case the task completes quickly. If the task has not completed by then, firmware enables the interrupt and switches to another task.
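
The poll-first, interrupt-second approach can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the register layout, the bit names <code>STATUS_ABORT_DONE</code> and <code>IRQ_EN_ABORT_DONE</code>, and the function <code>wait_for_abort()</code> are all hypothetical, and the "registers" are simulated in ordinary memory so the sketch can run on a host.

```c
#include <stdint.h>

/* Hypothetical bit assignments; a real block's documentation defines these. */
#define STATUS_ABORT_DONE  (1u << 0)   /* set by hardware when the abort completes */
#define IRQ_EN_ABORT_DONE  (1u << 0)   /* set by firmware to unmask the interrupt  */

/* Hypothetical register block, simulated in RAM for this sketch. */
typedef struct {
    volatile uint32_t status;
    volatile uint32_t irq_en;
} block_regs_t;

typedef enum {
    WAIT_DONE_BY_POLLING,   /* abort finished within the polling budget      */
    WAIT_DEFERRED_TO_IRQ    /* budget used up; interrupt has been unmasked   */
} wait_result_t;

/*
 * Poll the status bit at most max_polls times. If the abort has not
 * completed by then, unmask the completion interrupt so the caller can
 * switch to other work. The number of polls actually used is reported
 * through polls_used, so it can be measured and the budget tuned.
 */
wait_result_t wait_for_abort(block_regs_t *regs, unsigned max_polls,
                             unsigned *polls_used)
{
    unsigned n = 0;

    while (n < max_polls) {
        n++;
        if (regs->status & STATUS_ABORT_DONE) {
            *polls_used = n;
            return WAIT_DONE_BY_POLLING;
        }
    }

    *polls_used = n;
    regs->irq_en |= IRQ_EN_ABORT_DONE;   /* fall back to the interrupt */
    return WAIT_DEFERRED_TO_IRQ;
}
```

The sketch assumes the completion interrupt is latched, so an abort that finishes between the last poll and the unmask still raises the interrupt. If the hardware does not behave that way, firmware would need to re-check the status bit after unmasking.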

To help engineers implement the firmware, state in the block’s documentation the min and max abort times and the conditions under which they occur. It could be something such as, “if the block is already idle, the abort will complete in 20ns; otherwise it will take 2-3us to complete.”
<blockquote><strong>Best Practice:</strong> Document the min and max times that a hardware task will take, including the conditions and states that affect those times.</blockquote>
I used aborts for these examples, but the concepts apply for any firmware-initiated hardware task that could take time to complete. Implementing both status and interrupt bits for short- and long-delay hardware tasks allows firmware to balance the system load and performance by using polling loops or interrupts as appropriate.

Until the next interrupt (which will not occur for at least 1,000,000,000,000us)...]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/embedded-bridge/2010/05/balancing-how-firmware-waits-on-hardware/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Considerate coding</title>
				<link>http://embeddedgurus.com/stack-overflow/2010/05/considerate-coding/</link>
				<comments>http://embeddedgurus.com/stack-overflow/2010/05/considerate-coding/#comments</comments>
				<pubDate>Mon, 03 May 2010 12:08:19 +0000</pubDate>
				<dc:creator>Nigel Jones</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/stack-overflow/2010/05/considerate-coding/</guid>
                <description><![CDATA[One of my major recreational pursuits is bike riding. I live in a rural area with some great terrain, and more to the point a very low traffic density. Naturally on a 5 or 6 hour ride one does encounter some traffic and I'm always struck by the different degrees of consideration afforded to cyclists by motorists. Some are extremely solicitous and will wait so that they can pass you slowly and with a wide separation; others are complete jerks and will pass you as close and as fast as possible, often sounding their horn as they go by. Then there are the bulk of the drivers who will attempt to give you as much room as possible commensurate with slowing them down as little as possible. I was pondering this view of human nature yesterday while out on a ride, when it occurred to me that I see a similar range of consideration when it comes to embedded software. To see what I mean, read on ...

I've mentioned several times in this blog that the main purpose of source code is not to generate a binary image but rather to educate those that come after you (including possibly a future version of yourself). You may or may not subscribe to this belief. However once you realize that source code often has a life of decades, and that the same source code may end up in dozens of products, then perhaps you may start to change your mind. So with this said, I think I can make a number of observations.
<ol>
	<li>It may be obvious to you, the author of the code, what the intended compilation platform is - after all it's the one you are using. Alas, it is not obvious to someone else who has been handed the source code and told to use it. (I ran into this problem six months ago, when I had a very large ARM project with no indication of which ARM compiler it was intended for.)</li>
	<li>It may also be obvious to you what hardware platform the code is intended for - again it's the one you are working on.</li>
	<li>It may also be obvious to you that the way to build the various targets is to change to directory X and invoke command Y with parameter Z - after all you do it ten times a day.</li>
	<li>It may also be obvious to you that the 27 warnings produced during the final build are benign - as after all you have checked them out.</li>
</ol>
However none of the above is clear to someone 5 years from now!

Clearly the above is just a partial list of what I call implicit information about a project. That is, information that is essential to being able to use the code base, but which is often omitted from the documentation by the author. It's my contention that the degree to which you explicitly provide this implicit information governs whether you are a jerk, a typical coder, or a considerate coder. Most of us (myself included) are typical coders (and I know this because I've seen a lot of code). If you want to make the move up to being a considerate coder, then here are a few things I suggest you do.
<ol>
	<li>Place all the implicit information in main.c. Why, you ask? Well, if I were to dump three hundred source files on you, which one would you look at first? (An acceptable alternative is to state in main.c that useful information may be found in file X. Be aware, however, that non-obvious source files sometimes get stripped out of source code archives.)</li>
	<li>Include in main.c, as a minimum, information about the compiler (including its version), the intended hardware target, and how to build the code.</li>
	<li>Think for a minute or two about all the other information you are implicitly using in writing the source code and building it - and take the time to include it in main.c. Typically this includes additional tools, scripts etc.</li>
	<li>For an excellent discourse on why leaving warnings in your code is downright inconsiderate, see this <a href="http://codereview.blogspot.com/2010/02/zero-tolerance.html">posting</a> from Alan Bowens.</li>
</ol>
If you do the above, then you are well on the way to becoming a 'considerate coder'. Will doing this get you a pay increase, or at least a pat on the back from the boss? Probably not. However, just like the person who slows down and passes cyclists with a wide berth, you can go home at night knowing you aren't a jerk. That has to be worth something.]]></description>
                <content:encoded><![CDATA[One of my major recreational pursuits is bike riding. I live in a rural area with some great terrain, and more to the point a very low traffic density. Naturally on a 5 or 6 hour ride one does encounter some traffic and I'm always struck by the different degrees of consideration afforded to cyclists by motorists. Some are extremely solicitous and will wait so that they can pass you slowly and with a wide separation; others are complete jerks and will pass you as close and as fast as possible, often sounding their horn as they go by. Then there are the bulk of the drivers who will attempt to give you as much room as possible commensurate with slowing them down as little as possible. I was pondering this view of human nature yesterday while out on a ride, when it occurred to me that I see a similar range of consideration when it comes to embedded software. To see what I mean, read on ...

I've mentioned several times in this blog that the main purpose of source code is not to generate a binary image but rather to educate those that come after you (including possibly a future version of yourself). You may or may not subscribe to this belief. However once you realize that source code often has a life of decades, and that the same source code may end up in dozens of products, then perhaps you may start to change your mind. So with this said, I think I can make a number of observations.
<ol>
	<li>It may be obvious to you, the author of the code, what the intended compilation platform is - after all it's the one you are using. Alas, it is not obvious to someone else who has been handed the source code and told to use it. (I ran into this problem six months ago, when I had a very large ARM project with no indication of which ARM compiler it was intended for.)</li>
	<li>It may also be obvious to you what hardware platform the code is intended for - again it's the one you are working on.</li>
	<li>It may also be obvious to you that the way to build the various targets is to change to directory X and invoke command Y with parameter Z - after all you do it ten times a day.</li>
	<li>It may also be obvious to you that the 27 warnings produced during the final build are benign - as after all you have checked them out.</li>
</ol>
However none of the above is clear to someone 5 years from now!

Clearly the above is just a partial list of what I call implicit information about a project. That is, information that is essential to being able to use the code base, but which is often omitted from the documentation by the author. It's my contention that the degree to which you explicitly provide this implicit information governs whether you are a jerk, a typical coder, or a considerate coder. Most of us (myself included) are typical coders (and I know this because I've seen a lot of code). If you want to make the move up to being a considerate coder, then here are a few things I suggest you do.
<ol>
	<li>Place all the implicit information in main.c. Why, you ask? Well, if I were to dump three hundred source files on you, which one would you look at first? (An acceptable alternative is to state in main.c that useful information may be found in file X. Be aware, however, that non-obvious source files sometimes get stripped out of source code archives.)</li>
	<li>Include in main.c, as a minimum, information about the compiler (including its version), the intended hardware target, and how to build the code.</li>
	<li>Think for a minute or two about all the other information you are implicitly using in writing the source code and building it - and take the time to include it in main.c. Typically this includes additional tools, scripts etc.</li>
	<li>For an excellent discourse on why leaving warnings in your code is downright inconsiderate, see this <a href="http://codereview.blogspot.com/2010/02/zero-tolerance.html">posting</a> from Alan Bowens.</li>
</ol>
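As a rough illustration of the list above, the top of main.c might look something like the sketch below. Every specific value in it (compiler version, board name, build command, script and file names) is an invented placeholder, not taken from any real project. The only executable part is a compile-time check that turns one documented assumption (a C99 or later compiler) into something the build enforces.

```c
/*
 * main.c
 *
 * READ THIS FIRST. All values below are illustrative placeholders.
 *
 * Compiler : arm-none-eabi-gcc 4.4.1 (state the exact version you build with)
 * Target   : "Widget" controller board, rev B, ARM7TDMI at 48 MHz
 * Build    : change to the build directory, then run: make TARGET=release
 * Tools    : GNU make, plus scripts/mkimage.py to package the binary
 * Warnings : the release build is expected to produce 0 warnings
 * See also : docs/BUILD.txt (the essentials are repeated here in case
 *            that file gets stripped from an archive)
 */

/*
 * Where possible, back a documented assumption with a compile-time check.
 * Assumption made for this sketch: the code needs a C99 (or later) compiler.
 */
#if !defined(__STDC_VERSION__) || (__STDC_VERSION__ < 199901L)
#error "This project assumes a C99 or later compiler; see the notes at the top of main.c"
#endif

/* ... the rest of main.c follows ... */
```

A comment block like this costs a few minutes to write, and the preprocessor check means at least one of the stated assumptions cannot silently go stale.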
If you do the above, then you are well on the way to becoming a 'considerate coder'. Will doing this get you a pay increase, or at least a pat on the back from the boss? Probably not. However, just like the person who slows down and passes cyclists with a wide berth, you can go home at night knowing you aren't a jerk. That has to be worth something.]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/stack-overflow/2010/05/considerate-coding/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Butcher, Baker, Candlestick Maker</title>
				<link>http://embeddedgurus.com/area-0x51/2010/04/butcher-baker-candlestick-maker/</link>
				<comments>http://embeddedgurus.com/area-0x51/2010/04/butcher-baker-candlestick-maker/#comments</comments>
				<pubDate>Fri, 30 Apr 2010 11:24:25 +0000</pubDate>
				<dc:creator>Mike Ficco</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/area-0x51/2010/04/butcher-baker-candlestick-maker/</guid>
                <description><![CDATA[Doctors and Engineers are two of the most respected occupations in our society.  While both are highly educated and highly intelligent, in my opinion there are far more differences between the two careers than similarities.  For example, an engineer may be expected to start with a blank sheet of paper and design a complicated system, but a doctor starts with an incredibly complicated system (the human body) and tries to figure out why it's not working right.

Both groups can have substantial egos, but the working environment accommodates this very differently.  A doctor's self-esteem may be regularly bolstered as numerous patients view him as a savior, but in some companies engineers' self-esteem can be undermined every day by management attitudes and demands.

There are other significant differences in the working environment.  In the office of a private practitioner the word and will of the doctor is final.  Even in a large hospital doctors are rarely contradicted or overruled.  Doctors are mutually supportive and hesitate to criticize each other in public – perhaps for the purely practical reason of avoiding lawsuits.

In contrast, engineers rarely get to make visible decisions.  They can make dozens or hundreds of decisions on which components to use and how to compose software.  However, the business and marketing groups make the decisions on visible issues like which bugs to fix, which features to implement, and when the product works well enough to ship.  In this sense, engineers are more like assembly line workers than doctors.  They are told what to do and often reminded to work harder.

With respect to safety and security, a rogue doctor can sicken or kill a few people by misdiagnosis or administration of inappropriate drugs.  Major damage is greatly limited by many safeguards and checks from insurance companies, federal and state authorities, and highly trained pharmacists who distribute the drugs.  In fact, there is an entire industry focused on, and making a great deal of money from, ensuring doctors act properly – or suing them if they don't.

One bad or malicious programmer, on the other hand, can endanger hundreds, thousands, or more.  They can take down a power grid, disrupt stop lights, or simply do a bad job implementing automotive anti-lock brakes.

It is interesting that engineers have so little comparative power and authority yet can cause so much harm and damage.  Indeed, inappropriate or malicious use of a compiler can be far more dangerous to a much larger portion of society than could ever be endangered by any doctor not working on germ warfare.  One doesn't even have to be a technical professional to acquire a compiler or similar tool.  Such tools give ANYONE the ability to hack into credit card records, banks, or the national defense network.  There are no checks and no safeguards on the usage of these tools.  The man on the street has unrestricted access to compilers and other highly dangerous software tools without demonstration of any need and without an authorization or "prescription" from a well-trained professional.

We have created a society that requires a doctor's prescription for a great amount of the truly effective medicine (see my blog "<a href="http://connect.mypath.com/mypath/blogs/engineering-success/2010/03/28/saving-real-money-on-healthcare">Saving REAL Money on Healthcare</a>") but allows any untrained and unsupervised clown – perhaps with a criminal record – to acquire potentially far more dangerous computer equipment and tools.

Why is it, in general, doctors are the kings and queens of their domain but engineers simply follow orders handed down by the business group?  Why is it that medical students must serve an apprenticeship (internship) before they are allowed to practice medicine but most young engineers receive ad hoc on-the-job training from the (perhaps qualified) guy in the next cubicle?  Why is it that we are so well conditioned by our culture that the social structure I described seems normal and acceptable?

Perhaps when we can answer these questions we, the engineers, will finally be worthy of being as highly regarded as medical doctors.  Until then our masters will continue to tell us the bugs to fix and when we will finish.]]></description>
                <content:encoded><![CDATA[Doctors and Engineers are two of the most respected occupations in our society.  While both are highly educated and highly intelligent, in my opinion there are far more differences between the two careers than similarities.  For example, an engineer may be expected to start with a blank sheet of paper and design a complicated system, but a doctor starts with an incredibly complicated system (the human body) and tries to figure out why it's not working right.

Both groups can have substantial egos, but the working environment accommodates this very differently.  A doctor's self-esteem may be regularly bolstered as numerous patients view him as a savior, but in some companies engineers' self-esteem can be undermined every day by management attitudes and demands.

There are other significant differences in the working environment.  In the office of a private practitioner the word and will of the doctor is final.  Even in a large hospital doctors are rarely contradicted or overruled.  Doctors are mutually supportive and hesitate to criticize each other in public – perhaps for the purely practical reason of avoiding lawsuits.

In contrast, engineers rarely get to make visible decisions.  They can make dozens or hundreds of decisions on which components to use and how to compose software.  However, the business and marketing groups make the decisions on visible issues like which bugs to fix, which features to implement, and when the product works well enough to ship.  In this sense, engineers are more like assembly line workers than doctors.  They are told what to do and often reminded to work harder.

With respect to safety and security, a rogue doctor can sicken or kill a few people by misdiagnosis or administration of inappropriate drugs.  Major damage is greatly limited by many safeguards and checks from insurance companies, federal and state authorities, and highly trained pharmacists who distribute the drugs.  In fact, there is an entire industry focused on, and making a great deal of money from, ensuring doctors act properly – or suing them if they don't.

One bad or malicious programmer, on the other hand, can endanger hundreds, thousands, or more.  They can take down a power grid, disrupt stop lights, or simply do a bad job implementing automotive anti-lock brakes.

It is interesting that engineers have so little comparative power and authority yet can cause so much harm and damage.  Indeed, inappropriate or malicious use of a compiler can be far more dangerous to a much larger portion of society than could ever be endangered by any doctor not working on germ warfare.  One doesn't even have to be a technical professional to acquire a compiler or similar tool.  Such tools give ANYONE the ability to hack into credit card records, banks, or the national defense network.  There are no checks and no safeguards on the usage of these tools.  The man on the street has unrestricted access to compilers and other highly dangerous software tools without demonstration of any need and without an authorization or "prescription" from a well-trained professional.

We have created a society that requires a doctor's prescription for a great amount of the truly effective medicine (see my blog "<a href="http://connect.mypath.com/mypath/blogs/engineering-success/2010/03/28/saving-real-money-on-healthcare">Saving REAL Money on Healthcare</a>") but allows any untrained and unsupervised clown – perhaps with a criminal record – to acquire potentially far more dangerous computer equipment and tools.

Why is it, in general, doctors are the kings and queens of their domain but engineers simply follow orders handed down by the business group?  Why is it that medical students must serve an apprenticeship (internship) before they are allowed to practice medicine but most young engineers receive ad hoc on-the-job training from the (perhaps qualified) guy in the next cubicle?  Why is it that we are so well conditioned by our culture that the social structure I described seems normal and acceptable?

Perhaps when we can answer these questions we, the engineers, will finally be worthy of being as highly regarded as medical doctors.  Until then our masters will continue to tell us the bugs to fix and when we will finish.]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/area-0x51/2010/04/butcher-baker-candlestick-maker/feed/</wfw:commentRss>
			</item>
						<item>
				<title>RTOS without blocking?</title>
				<link>http://embeddedgurus.com/state-space/2010/04/rtos-without-blocking/</link>
				<comments>http://embeddedgurus.com/state-space/2010/04/rtos-without-blocking/#comments</comments>
				<pubDate>Mon, 19 Apr 2010 18:13:55 +0000</pubDate>
				<dc:creator>Miro Samek</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/state-space/2010/04/rtos-without-blocking/</guid>
                <description><![CDATA[In my previous post, "I hate RTOSes", I identified blocking as the main cause of the particular brittleness and inflexibility of programs based on RTOSes. Here I'd like to discuss techniques for minimizing blocking and eradicating it completely from the application-level code. In other words, I'd like to show you how to use an RTOS for building responsive event-driven software.

For reasons I've outlined before, experienced RTOS users have learned to be wary of peppering the code with blocking calls to the RTOS. So, even though every RTOS boasts a plethora of communication and synchronization mechanisms (all of them based on blocking), advanced real-time developers intentionally limit their designs to just one generic blocking call per task, as shown in the following pseudocode:
<pre>void task_routine(void *arg) {
    while (1) {
        // block on any event designated for this task (generic)
        // process the event *without* further blocking (task specific)
    }
}
</pre>
Most RTOSes provide mechanisms to wait for <strong>multiple</strong> events in a single blocking call, for example: event flags, message mailboxes, message queues, the <code>select()</code> call, condition variables, and many others. Of all these possibilities, I'd like to single out the message queue, because it is the most generic and flexible mechanism. A message posted to a message queue not only unblocks any task that waits on the queue (synchronization), but the message can also contain any information associated with the event (interprocess communication). For example, a message from an analog-to-digital converter (ADC) can both signal that the conversion has completed and carry the actual value of the conversion result.

The generic pseudocode of a task based on a message queue looks as follows:
<pre>void task_routine(void *arg) {
    while (1) { // main event loop of the task
        void *event = msg_queue_get(); // wait for event
        // process the event *without* further blocking (task specific)
    }
}
</pre>
The most important premise of this <strong>event-loop</strong> design is that the task-specific code that processes the events obtained from the queue is <strong>not</strong> allowed to block. The event-processing code must execute quickly and return to the event loop, so that the event loop can check for other events.

This design also automatically guarantees that each event is processed in <strong>run-to-completion</strong> (RTC) fashion. By design, the event loop must necessarily complete processing of the current event before looping back to obtain and process the next event. Also note that the need for queuing events is an immediate consequence of the RTC processing style. Queuing prevents losing events that arrive while the event-loop is executing an RTC step.

The event-loop pseudocode shown above is still task-specific, but it is quite easy to make it completely generic. As shown below, you can combine a message queue and an event-handler pointer-to-function in the TCB structure. A pointer to the TCB struct can then be passed to the task in the argument of the task routine (<code>arg</code>). This is quite easily achieved when the task is created.
<pre>typedef struct {
    MessageQueue queue;           // event queue associated with the task
    void (*handler)(void *event); // event handler pointer-to-function
} TCB;   // task control block

void task_routine(void *arg) {
    TCB *tcb = (TCB *)arg; // TCB pointer supplied when the task is created
    while (1) { // main event loop of the task
        void *event = msg_queue_get(tcb-&gt;queue); // wait for event
        (*tcb-&gt;handler)(event); // handle the event without blocking
    }
}
</pre>
The last snippet of code is generic, meaning that this simple event-loop can be used for <strong>all</strong> tasks in your application. So at this point, you can consider the <code>task_routine()</code> function as part of the generic event-driven infrastructure for executing your applications, which consist of event-handler functions.

What this way of thinking gives you is quite significant, because in fact you have just created your first event-driven <strong>framework</strong>.

The distinction between a framework and a toolkit is simple. A toolkit, such as an RTOS, is essentially a collection of functions that <strong>you</strong> can call. When you use a toolkit, <strong>you</strong> write the main body of the application (such as all the task routines) and <strong>you</strong> call the various functions from the RTOS. When you use a framework, you reuse the main body (such as the <code>task_routine()</code> function) and you provide the code that <strong>the framework</strong> calls. In other words, a framework uses <strong>inverted</strong> control compared to a traditional RTOS.

Inversion of control is a very common phenomenon in all event-driven architectures, because it recognizes the plain fact that the events are controlling the application, not the other way around.

In my next post in the "I hate RTOSes" series, I'll talk about challenges of programming without blocking. I'll explain what you need to sacrifice when you write non-blocking code and why this often leads to "spaghetti" code. Stay tuned!]]></description>
                <content:encoded><![CDATA[In my previous post, "I hate RTOSes", I identified blocking as the main cause of the particular brittleness and inflexibility of programs based on RTOSes. Here I'd like to discuss techniques for minimizing blocking and eradicating it completely from the application-level code. In other words, I'd like to show you how to use an RTOS for building responsive event-driven software.

For reasons I've outlined before, experienced RTOS users have learned to be wary of peppering the code with blocking calls to the RTOS. So, even though every RTOS boasts a plethora of communication and synchronization mechanisms (all of them based on blocking), advanced real-time developers intentionally limit their designs to just one generic blocking call per task, as shown in the following pseudocode:
<pre>void task_routine(void *arg) {
    while (1) {
        // block on any event designated for this task (generic)
        // process the event *without* further blocking (task specific)
    }
}
</pre>
Most RTOSes provide mechanisms to wait for <strong>multiple</strong> events in a single blocking call, for example: event flags, message mailboxes, message queues, the <code>select()</code> call, condition variables, and many others. Of all these possibilities, I'd like to single out the message queue, because it is the most generic and flexible mechanism. A message posted to a message queue not only unblocks any task that waits on the queue (synchronization), but the message can also contain any information associated with the event (interprocess communication). For example, a message from an analog-to-digital converter (ADC) can both signal that the conversion has completed and carry the actual value of the conversion result.
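
To make the ADC example concrete, here is a minimal sketch of an event that carries both a signal and a payload, together with a tiny ring-buffer queue. None of this is a real RTOS API: the types <code>event_t</code> and <code>msg_queue_t</code> and the functions <code>msg_queue_post()</code> and <code>msg_queue_get()</code> are stand-ins invented for illustration. The queue here returns immediately when empty, where a real RTOS call would block, and it is not interrupt-safe.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event layout: the signal says WHAT happened,
 * the payload carries the data that goes with it. */
typedef enum { SIG_ADC_DONE, SIG_TIMEOUT } signal_t;

typedef struct {
    signal_t sig;
    uint16_t value;        /* e.g. the ADC conversion result */
} event_t;

#define QUEUE_LEN 8u

typedef struct {
    event_t  buf[QUEUE_LEN];
    unsigned head;         /* index of the oldest queued event */
    unsigned count;        /* number of events currently queued */
} msg_queue_t;

/* Copy an event into the queue. Returns false if the queue is full. */
bool msg_queue_post(msg_queue_t *q, event_t e)
{
    if (q->count == QUEUE_LEN) {
        return false;
    }
    q->buf[(q->head + q->count) % QUEUE_LEN] = e;
    q->count++;
    return true;
}

/* Remove the oldest event. Returns false if the queue is empty
 * (a real RTOS queue would block the calling task here instead). */
bool msg_queue_get(msg_queue_t *q, event_t *out)
{
    if (q->count == 0u) {
        return false;
    }
    *out = q->buf[q->head];
    q->head = (q->head + 1u) % QUEUE_LEN;
    q->count--;
    return true;
}
```

An ADC interrupt handler would post an event such as <code>{ SIG_ADC_DONE, result }</code>, and the receiving task learns both that the conversion finished and what the reading was from that single message.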

The generic pseudocode of a task based on a message queue looks as follows:
<pre>void task_routine(void *arg) {
    while (1) { // main event loop of the task
        void *event = msg_queue_get(); // wait for event
        // process the event *without* further blocking (task specific)
    }
}
</pre>
The most important premise of this <strong>event-loop</strong> design is that the task-specific code that processes the events obtained from the queue is <strong>not</strong> allowed to block. The event-processing code must execute quickly and return to the event loop, so that the event loop can check for other events.

This design also automatically guarantees that each event is processed in <strong>run-to-completion</strong> (RTC) fashion. By design, the event loop must necessarily complete processing of the current event before looping back to obtain and process the next event. Also note that the need for queuing events is an immediate consequence of the RTC processing style. Queuing prevents losing events that arrive while the event-loop is executing an RTC step.
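
The link between run-to-completion and queuing can be seen in a small host-side sketch. Everything here is hypothetical scaffolding: events are plain integers, the queue is a toy ring buffer, and <code>run_until_idle()</code> stands in for the endless task loop by returning when the queue is empty rather than blocking. The point to watch is that an event posted while another event is being processed is not handled on the spot; it waits in the queue until the current RTC step has finished.

```c
#define QUEUE_LEN 4u

/* Toy ring-buffer queue of integer events (illustrative only). */
typedef struct {
    int      buf[QUEUE_LEN];
    unsigned head;
    unsigned count;
} queue_t;

static int queue_post(queue_t *q, int e)
{
    if (q->count == QUEUE_LEN) {
        return 0;                       /* full: event would be lost */
    }
    q->buf[(q->head + q->count) % QUEUE_LEN] = e;
    q->count++;
    return 1;
}

static int queue_get(queue_t *q, int *e)
{
    if (q->count == 0u) {
        return 0;                       /* empty: a real task would block */
    }
    *e = q->buf[q->head];
    q->head = (q->head + 1u) % QUEUE_LEN;
    q->count--;
    return 1;
}

typedef struct {
    queue_t  queue;
    int      handled[8];                /* order in which events completed */
    unsigned n_handled;
} task_t;

/* One RTC step. Handling event 1 generates a follow-up event 2,
 * which is queued rather than processed recursively. */
static void process_event(task_t *t, int event)
{
    if (event == 1) {
        (void)queue_post(&t->queue, 2);
    }
    if (t->n_handled < 8u) {
        t->handled[t->n_handled++] = event;
    }
}

/* Event loop: each event is fully processed before the next is fetched. */
void run_until_idle(task_t *t)
{
    int event;
    while (queue_get(&t->queue, &event)) {
        process_event(t, event);
    }
}
```

If events 1 and 3 are posted before the loop runs, they complete in the order 1, 3, 2: the follow-up event 2 is not lost, but it also does not interrupt the step that created it.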

The event-loop pseudocode shown above is still task-specific, but it is quite easy to make it completely generic. As shown below, you can combine a message queue and an event-handler pointer-to-function in the TCB structure. A pointer to the TCB struct can then be passed to the task in the argument of the task routine (<code>arg</code>). This is quite easily achieved when the task is created.
<pre>typedef struct {
    MessageQueue queue;           // event queue associated with the task
    void (*handler)(void *event); // event handler pointer-to-function
} TCB;   // task control block

void task_routine(void *arg) {
    TCB *tcb = (TCB *)arg; // TCB pointer supplied when the task is created
    while (1) { // main event loop of the task
        void *event = msg_queue_get(tcb-&gt;queue); // wait for event
        (*tcb-&gt;handler)(event); // handle the event without blocking
    }
}
</pre>
The last snippet of code is generic, meaning that this simple event-loop can be used for <strong>all</strong> tasks in your application. So at this point, you can consider the <code>task_routine()</code> function as part of the generic event-driven infrastructure for executing your applications, which consist of event-handler functions.

What this way of thinking gives you is quite significant, because in fact you have just created your first event-driven <strong>framework</strong>.

The distinction between a framework and a toolkit is simple. A toolkit, such as an RTOS, is essentially a collection of functions that <strong>you</strong> can call. When you use a toolkit, <strong>you</strong> write the main body of the application (such as all the task routines) and <strong>you</strong> call the various functions from the RTOS. When you use a framework, you reuse the main body (such as the <code>task_routine()</code> function) and you provide the code that <strong>the framework</strong> calls. In other words, a framework uses <strong>inverted</strong> control compared to a traditional RTOS.
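
The inversion of control can be illustrated with a very small sketch. This is not any particular framework's API: <code>fw_register()</code>, <code>fw_dispatch()</code> and the two handlers are names made up for this example, and the event queues are left out to keep the focus on who calls whom. The application supplies only handlers; the framework code decides when they run.

```c
#include <stddef.h>

/* ---- framework side (reused as-is by every application) ---- */

typedef void (*event_handler_t)(void *ctx, int event);

typedef struct {
    event_handler_t handler;   /* application-provided callback */
    void           *ctx;       /* application-owned state        */
} fw_task_t;

#define FW_MAX_TASKS 4u

static fw_task_t fw_tasks[FW_MAX_TASKS];
static size_t    fw_task_count;

/* The application registers a handler; returns a task id, or -1 if full. */
int fw_register(event_handler_t handler, void *ctx)
{
    if (fw_task_count == FW_MAX_TASKS) {
        return -1;
    }
    fw_tasks[fw_task_count].handler = handler;
    fw_tasks[fw_task_count].ctx = ctx;
    return (int)fw_task_count++;
}

/* The framework calls the application. Returns 0 on success, -1 on a bad id. */
int fw_dispatch(int task_id, int event)
{
    if (task_id < 0 || (size_t)task_id >= fw_task_count) {
        return -1;
    }
    fw_tasks[task_id].handler(fw_tasks[task_id].ctx, event);
    return 0;
}

/* ---- application side: handlers only, no loops, no blocking ---- */

/* Counts how many events (e.g. ticks) it has been given. */
void led_handler(void *ctx, int event)
{
    int *toggles = ctx;
    (void)event;
    (*toggles)++;
}

/* Remembers the most recent sample delivered as the event value. */
void adc_handler(void *ctx, int event)
{
    int *last_sample = ctx;
    *last_sample = event;
}
```

With a toolkit, the application would own the loop and call into the library. Here the application code never calls <code>led_handler()</code> or <code>adc_handler()</code> directly; it registers them and the framework invokes them when events arrive.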

Inversion of control is a very common phenomenon in all event-driven architectures, because it recognizes the plain fact that the events are controlling the application, not the other way around.

In my next post in the "I hate RTOSes" series, I'll talk about challenges of programming without blocking. I'll explain what you need to sacrifice when you write non-blocking code and why this often leads to "spaghetti" code. Stay tuned!]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/state-space/2010/04/rtos-without-blocking/feed/</wfw:commentRss>
			</item>
						<item>
				<title>I hate RTOSes</title>
				<link>http://embeddedgurus.com/state-space/2010/04/i-hate-rtoses/</link>
				<comments>http://embeddedgurus.com/state-space/2010/04/i-hate-rtoses/#comments</comments>
				<pubDate>Mon, 12 Apr 2010 18:22:16 +0000</pubDate>
				<dc:creator>Miro Samek</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/state-space/2010/04/i-hate-rtoses/</guid>
                <description><![CDATA[I have to confess that I've been experiencing severe writer's block lately. It's not that I'm short of subjects to talk about, but I'm getting tired of circling around the issues that matter most to me and should matter most to any embedded software developer. I mean the basic software structure.

Unfortunately, I find it impossible to talk about truly important issues without stepping on somebody's toes, which means picking a fight. So, in this installment I decided to come out of the closet and say it openly: I hate RTOSes.

The main reason I say so is that a conventional RTOS implies a certain programming paradigm, which leads to particularly brittle designs. I'm talking about blocking. Blocking occurs any time you wait explicitly in-line for something to happen. All RTOSes provide an assortment of blocking mechanisms, such as various semaphores, event-flags, mailboxes, message queues, and so on. Every RTOS task, structured as an endless loop, must use at least one such blocking mechanism, or else it will take all the CPU cycles. Typically, however, tasks block in many places scattered throughout various functions called from the task routine (the endless loop). For example, a task can block and wait for a semaphore that indicates the end of an ADC conversion. In another part of the code, the same task might wait for a timeout event flag, and so on.

Blocking is evil, because it appears to work initially, but quickly degenerates into an unmanageable mess. The problem is that while a task is blocked, the task is not doing any other work and is not responsive to other events. Such a task cannot be easily extended to handle other events, not just because the system is unresponsive, but also because the whole structure of the code past the blocking call is designed to handle only the event that it was explicitly waiting for.

You might think that the difficulty of adding new features (events and behaviors) to such designs is only important later, when the original software is maintained or reused for the next similar project. I disagree. Flexibility is vital from day one. Any application of nontrivial complexity is developed over time by gradually adding new events and behaviors. The inflexibility prevents an application from growing that way, so the design degenerates in the process known as architectural decay. This in turn often makes it impossible to even finish the original application, let alone maintain it.

The mechanisms of architectural decay of RTOS-based applications are manifold, but perhaps the worst is the unnecessary proliferation of tasks. Designers, unable to add new events to unresponsive tasks, are forced to create new tasks, regardless of coupling and cohesion. Often the new feature uses the same data as another feature in another task (we call such features cohesive). But placing the new feature in a different task requires very careful sharing of the common data, so mutexes and other such mechanisms must be applied. The designer ends up spending most of the time not on the feature at hand, but on managing subtle, hairy, unintended side-effects.

For decades embedded engineers were taught to believe that the only two alternatives for structuring embedded software are a "superloop" (main+ISRs) or an RTOS. But this is of course not true. Other alternatives exist; specifically, event-driven programming with modern state machines is a much better way. It is not a silver bullet, of course, but after having used this method extensively for over a decade I will never go back to a raw RTOS. I plan to write more about this better way, why it is better and where it is still weak. Stay tuned.]]></description>
                <content:encoded><![CDATA[I have to confess that I've been experiencing a severe writer's block lately. It's not that I'm short of subjects to talk about, but I'm getting tired of circling around the issues that matter most to me and should matter the most to any embedded software developer. I mean the basic software structure.

Unfortunately, I find it impossible to talk about truly important issues without stepping on somebody's toes, which means picking a fight. So, in this installment I decided to come out of the closet and say it openly: I hate RTOSes.

The main reason I say so is that a conventional RTOS implies a certain programming paradigm, which leads to particularly brittle designs. I'm talking about blocking. Blocking occurs any time you wait explicitly in-line for something to happen. All RTOSes provide an assortment of blocking mechanisms, such as various semaphores, event-flags, mailboxes, message queues, and so on. Every RTOS task, structured as an endless loop, must use at least one such blocking mechanism, or else it will take all the CPU cycles. Typically, however, tasks block in many places scattered throughout various functions called from the task routine (the endless loop). For example, a task can block and wait for a semaphore that indicates the end of an ADC conversion. In another part of the code, the same task might wait for a timeout event flag, and so on.

Blocking is evil, because it appears to work initially, but quickly degenerates into an unmanageable mess. The problem is that while a task is blocked, the task is not doing any other work and is not responsive to other events. Such a task cannot be easily extended to handle other events, not just because the system is unresponsive, but also because the whole structure of the code past the blocking call is designed to handle only the event that it was explicitly waiting for.

You might think that the difficulty of adding new features (events and behaviors) to such designs is only important later, when the original software is maintained or reused for the next similar project. I disagree. Flexibility is vital from day one. Any application of nontrivial complexity is developed over time by gradually adding new events and behaviors. The inflexibility prevents an application from growing that way, so the design degenerates in the process known as architectural decay. This in turn often makes it impossible to even finish the original application, let alone maintain it.

The mechanisms of architectural decay of RTOS-based applications are manifold, but perhaps the worst is the unnecessary proliferation of tasks. Designers, unable to add new events to unresponsive tasks, are forced to create new tasks, regardless of coupling and cohesion. Often the new feature uses the same data as another feature in another task (we call such features cohesive). But placing the new feature in a different task requires very careful sharing of the common data, so mutexes and other such mechanisms must be applied. The designer ends up spending most of the time not on the feature at hand, but on managing subtle, hairy, unintended side-effects.

For decades embedded engineers were taught to believe that the only two alternatives for structuring embedded software are a "superloop" (main+ISRs) or an RTOS. But this is of course not true. Other alternatives exist; specifically, event-driven programming with modern state machines is a much better way. It is not a silver bullet, of course, but after having used this method extensively for over a decade I will never go back to a raw RTOS. I plan to write more about this better way, why it is better and where it is still weak. Stay tuned.]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/state-space/2010/04/i-hate-rtoses/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Efficient C Tip #12 - Be wary of switch statements</title>
				<link>http://embeddedgurus.com/stack-overflow/2010/04/efficient-c-tip-12-be-wary-of-switch-statements/</link>
				<comments>http://embeddedgurus.com/stack-overflow/2010/04/efficient-c-tip-12-be-wary-of-switch-statements/#comments</comments>
				<pubDate>Sat, 10 Apr 2010 14:36:44 +0000</pubDate>
				<dc:creator>Nigel Jones</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/stack-overflow/2010/04/efficient-c-tip-12-be-wary-of-switch-statements/</guid>
                <description><![CDATA[This is the twelfth in a <a href="http://embeddedgurus.com/stack-overflow/2008/06/efficient-c-tips-1-choosing-the-correct-integer-size/">series</a> of tips on writing efficient C for embedded systems. Like the <a href="http://embeddedgurus.com/stack-overflow/2010/02/efficient-c-tip-11-avoid-passing-parameters-by-using-more-small-functions/">previous topic</a>, I suspect that this will be a bit controversial. As the title suggests, if you are interested in writing efficient C, you need to be wary of switch statements. Before I explain why, a little background will be useful. I did <strong>all</strong> of my early embedded systems programming in assembly language. This wasn't out of some sense of machismo, it was simply a reflection of the fact that there were no high level languages available (with the possible exception of <a href="http://en.wikipedia.org/wiki/PL/M">PL/M</a>). Naturally as well as programming embedded systems I also did computer programming, initially in Pascal and BASIC, and later in C. One of the major differences I found in using the HLL was the wonderful switch / case statement. I found it to be a beautiful tool - with a few lines of source code I could do all sorts of powerful things that were simply very difficult or tedious to do in assembly language. Fast forward a number of years and C compilers began to become available for small embedded systems and so I naturally started using them, together with of course the attendant switch statement. All was well in paradise until the day I used a switch statement in an interrupt service routine and found to my horror that the ISR was taking about ten times longer to execute than I thought was reasonable.

This precipitated an investigation into how exactly switch statements are implemented by the compiler. When I did this, I discovered a number of things that should give one pause.
<h3>Heuristic Algorithms</h3>
The first thing I discovered is that compilers typically have a number of ways of implementing a switch statement. They seem to be loosely divided into the following trichotomy:
<ol>
	<li>An if-else-if-else-if chain. In this implementation, the switch statement is treated as syntactic sugar for an if-else-if chain.</li>
	<li>Some form of jump or control tables, or as they are sometimes called a computed goto. This is a favorite technique of assembly language programmers and the compiler writers can use it to great effect.</li>
	<li>A hybrid of 1 &amp; 2.</li>
</ol>
Where it gets interesting is how the compiler decides which approach to use. If the case values are contiguous (e.g. zero through ten), then it's likely the compiler will use some form of jump table. Conversely, if the case values are completely disjointed (e.g. zero, six, twenty, four hundred and a thousand), then an if-else implementation is likely. However, what does the compiler do when, for example, you have a bifurcated set of ranges such as zero to ten and ninety to one hundred? Well, the answer is that each compiler seems to have some form of heuristic algorithm for determining what is the 'best' way of implementing a given set of cases. Although some compilers allow you to force a particular implementation, for the most part you are at the mercy of the compiler.
<h3>Comparative Execution Speeds</h3>
If you think about it, it should become apparent that a jump table approach is likely to give a highly consistent time of execution through the decision tree, whereas the if-else-if chain has a highly variable time of execution depending upon the particular value of the switched variable. Notwithstanding this, the jump table approach has a certain amount of execution overhead associated with it. This means that although its mean execution time (which is normally the same as its worst and best execution time) may be dramatically better than the mean execution time of the if-else-if chain, the if-else-if chain's <strong>best</strong> execution time may be considerably better. So what, you say! Well, in some cases a particular value is far more likely to occur than the other values, so it would be very nice if this value were tested first. However, as you will now see, this isn't guaranteed...
<h3>Order of Execution</h3>
For many years I wrote switch statements under the assumption that the case values would be evaluated from top to bottom. That is, if the compiler chose to implement the switch statement as an if-else-if chain, then it would first test the first case, then the second case and so on down to the default case at the bottom of my source code. Well it turns out that my assumption was completely wrong. The compiler is under no such obligation, and indeed will often evaluate the values bottom to top. Furthermore, the compiler will often evaluate the <em>default </em>value first. For example consider a defaulted switch statement with contiguous case values in the range zero to ten. If the index variable is an unsigned int, then there are at least 65525 possible values handled by the default case, and so it makes sense to eliminate them first. Now if you know that the index variable can only possibly take on the values zero to ten, then you can of course eliminate the default statement - and then get excoriated by the coding standards / MISRA folks.
<h3>Maintenance</h3>
This is the area where I really get worried. Consider the case where you have a switch statement in an ISR. The code is working with no problems until one day it is necessary to make a change to the switch statement - by for example adding an additional case value. This simple change can cause the compiler to completely change the implementation of the switch statement. As a result, you may find that:
<ol>
	<li>The worst case execution time has jumped dramatically.</li>
	<li>The mean execution time has jumped dramatically.</li>
	<li>The stack space required by the ISR has jumped dramatically.</li>
</ol>
Any of these three possibilities can cause your program to fail catastrophically. Now of course one could argue 'that's why you test all changes'. However, in my opinion it's far better to be proactive and to avoid putting yourself in this situation in the first place.

I'd also be remiss in not noting the dreaded missing break statement maintenance problem. However as a religious user of Lint, I'm not normally too concerned about this.
<h3>Switch statement alternatives</h3>
If performance and stability are your goals, then I strongly recommend that you implement your code the way you want it executed. This means either explicitly using an if-else-if chain or using function pointers. If function pointers scare you, then you might want to read this <a href="http://www.rmbconsulting.us/Publications/jump-Tables.pdf">article</a> I wrote on the subject.
<h3>Recommendations</h3>
Based on my experience, I have a number of things that I do when it comes to switch statements. If you find my analysis compelling, you may want to adopt them:
<ol>
	<li>Switch statements should be the last language construct you reach for - and not the first.</li>
	<li>Learn how to use function pointers. Once you do you'll find a lot of the reasons for using switch statements go away.</li>
	<li>Try to keep all the case values contiguous.</li>
	<li>If you can't keep the case values contiguous, go to the other extreme and make them disparate - that way you are less likely to have the compiler change the algorithm on you.</li>
	<li>If your compiler supports it, consider using pragmas to lock in a particular implementation.</li>
	<li>Be very wary of using switch statements in interrupt service routines or any other performance critical code.</li>
	<li>Use Lint to guard against missing break statements.</li>
</ol>
Comments welcome!

<a href="http://embeddedgurus.com/stack-overflow/2010/02/efficient-c-tip-11-avoid-passing-parameters-by-using-more-small-functions/">Previous Tip</a>

<a href="http://embeddedgurus.com/stack-overflow/">Home</a>]]></description>
                <content:encoded><![CDATA[This is the twelfth in a <a href="http://embeddedgurus.com/stack-overflow/2008/06/efficient-c-tips-1-choosing-the-correct-integer-size/">series</a> of tips on writing efficient C for embedded systems. Like the <a href="http://embeddedgurus.com/stack-overflow/2010/02/efficient-c-tip-11-avoid-passing-parameters-by-using-more-small-functions/">previous topic</a>, I suspect that this will be a bit controversial. As the title suggests, if you are interested in writing efficient C, you need to be wary of switch statements. Before I explain why, a little background will be useful. I did <strong>all</strong> of my early embedded systems programming in assembly language. This wasn't out of some sense of machismo, it was simply a reflection of the fact that there were no high level languages available (with the possible exception of <a href="http://en.wikipedia.org/wiki/PL/M">PL/M</a>). Naturally as well as programming embedded systems I also did computer programming, initially in Pascal and BASIC, and later in C. One of the major differences I found in using the HLL was the wonderful switch / case statement. I found it to be a beautiful tool - with a few lines of source code I could do all sorts of powerful things that were simply very difficult or tedious to do in assembly language. Fast forward a number of years and C compilers began to become available for small embedded systems and so I naturally started using them, together with of course the attendant switch statement. All was well in paradise until the day I used a switch statement in an interrupt service routine and found to my horror that the ISR was taking about ten times longer to execute than I thought was reasonable.

This precipitated an investigation into how exactly switch statements are implemented by the compiler. When I did this, I discovered a number of things that should give one pause.
<h3>Heuristic Algorithms</h3>
The first thing I discovered is that compilers typically have a number of ways of implementing a switch statement. They seem to be loosely divided into the following trichotomy:
<ol>
	<li>An if-else-if-else-if chain. In this implementation, the switch statement is treated as syntactic sugar for an if-else-if chain.</li>
	<li>Some form of jump or control tables, or as they are sometimes called a computed goto. This is a favorite technique of assembly language programmers and the compiler writers can use it to great effect.</li>
	<li>A hybrid of 1 &amp; 2.</li>
</ol>
Where it gets interesting is how the compiler decides which approach to use. If the case values are contiguous (e.g. zero through ten), then it's likely the compiler will use some form of jump table. Conversely, if the case values are completely disjointed (e.g. zero, six, twenty, four hundred and a thousand), then an if-else implementation is likely. However, what does the compiler do when, for example, you have a bifurcated set of ranges such as zero to ten and ninety to one hundred? Well, the answer is that each compiler seems to have some form of heuristic algorithm for determining what is the 'best' way of implementing a given set of cases. Although some compilers allow you to force a particular implementation, for the most part you are at the mercy of the compiler.
<h3>Comparative Execution Speeds</h3>
If you think about it, it should become apparent that a jump table approach is likely to give a highly consistent time of execution through the decision tree, whereas the if-else-if chain has a highly variable time of execution depending upon the particular value of the switched variable. Notwithstanding this, the jump table approach has a certain amount of execution overhead associated with it. This means that although its mean execution time (which is normally the same as its worst and best execution time) may be dramatically better than the mean execution time of the if-else-if chain, the if-else-if chain's <strong>best</strong> execution time may be considerably better. So what, you say! Well, in some cases a particular value is far more likely to occur than the other values, so it would be very nice if this value were tested first. However, as you will now see, this isn't guaranteed...
<h3>Order of Execution</h3>
For many years I wrote switch statements under the assumption that the case values would be evaluated from top to bottom. That is, if the compiler chose to implement the switch statement as an if-else-if chain, then it would first test the first case, then the second case and so on down to the default case at the bottom of my source code. Well it turns out that my assumption was completely wrong. The compiler is under no such obligation, and indeed will often evaluate the values bottom to top. Furthermore, the compiler will often evaluate the <em>default </em>value first. For example consider a defaulted switch statement with contiguous case values in the range zero to ten. If the index variable is an unsigned int, then there are at least 65525 possible values handled by the default case, and so it makes sense to eliminate them first. Now if you know that the index variable can only possibly take on the values zero to ten, then you can of course eliminate the default statement - and then get excoriated by the coding standards / MISRA folks.
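
If the order of the tests matters, one option is to spell the chain out by hand so that the intent is at least visible in the source. The sketch below is only an illustration: the command codes and return values are made up, and <code>CMD_IDLE</code> is assumed to be by far the most frequent value. An optimizing compiler may still rearrange the comparisons, so it remains worth checking the generated code.

```c
#include <stdint.h>

/* Hypothetical command codes, for illustration only. */
enum { CMD_IDLE = 0, CMD_START = 1, CMD_STOP = 2 };

/* Returns a made-up action code for each command.
 * The value assumed to be most common is tested first. */
int dispatch_ordered(uint16_t cmd)
{
    if (cmd == CMD_IDLE) {           /* assumed most frequent: test first */
        return 0;
    } else if (cmd == CMD_START) {
        return 10;
    } else if (cmd == CMD_STOP) {
        return 20;
    } else {
        return -1;                   /* anything else is rejected last */
    }
}
```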
<h3>Maintenance</h3>
This is the area where I really get worried. Consider the case where you have a switch statement in an ISR. The code is working with no problems until one day it is necessary to make a change to the switch statement - by for example adding an additional case value. This simple change can cause the compiler to completely change the implementation of the switch statement. As a result, you may find that:
<ol>
	<li>The worst case execution time has jumped dramatically.</li>
	<li>The mean execution time has jumped dramatically.</li>
	<li>The stack space required by the ISR has jumped dramatically.</li>
</ol>
Any of these three possibilities can cause your program to fail catastrophically. Now of course one could argue 'that's why you test all changes'. However, in my opinion it's far better to be proactive and to avoid putting yourself in this situation in the first place.

I'd also be remiss in not noting the dreaded missing break statement maintenance problem. However as a religious user of Lint, I'm not normally too concerned about this.
<h3>Switch statement alternatives</h3>
If performance and stability are your goals, then I strongly recommend that you implement your code the way you want it executed. This means either explicitly using an if-else-if chain or using function pointers. If function pointers scare you, then you might want to read this <a href="http://www.rmbconsulting.us/Publications/jump-Tables.pdf">article</a> I wrote on the subject.
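
For the function pointer route, a minimal sketch looks something like the following. The handler names and return values are invented for the example, and it assumes the case values are contiguous and start at zero, so that the value can index the table directly after a single range check.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*handler_fn)(void);

/* Hypothetical handlers; each returns a made-up action code. */
static int handle_idle(void)  { return 0; }
static int handle_start(void) { return 10; }
static int handle_stop(void)  { return 20; }

/* One entry per contiguous case value, starting at zero. */
static handler_fn const handlers[] = {
    handle_idle,    /* 0 */
    handle_start,   /* 1 */
    handle_stop     /* 2 */
};

#define NUM_HANDLERS (sizeof handlers / sizeof handlers[0])

/* One range check plus one indirect call, whatever the value of cmd. */
int dispatch_table(uint16_t cmd)
{
    if (cmd >= NUM_HANDLERS) {
        return -1;                   /* plays the role of the default case */
    }
    return handlers[cmd]();
}
```

Adding a case here means adding a table entry, which does not change the shape of the dispatch code the way an extra <code>case</code> label can.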
<h3>Recommendations</h3>
Based on my experience, I have a number of things that I do when it comes to switch statements. If you find my analysis compelling, you may want to adopt them:
<ol>
	<li>Switch statements should be the last language construct you reach for - and not the first.</li>
	<li>Learn how to use function pointers. Once you do you'll find a lot of the reasons for using switch statements go away.</li>
	<li>Try to keep all the case values contiguous.</li>
	<li>If you can't keep the case values contiguous, go to the other extreme and make them disparate - that way you are less likely to have the compiler change the algorithm on you.</li>
	<li>If your compiler supports it, consider using pragmas to lock in a particular implementation.</li>
	<li>Be very wary of using switch statements in interrupt service routines or any other performance critical code.</li>
	<li>Use Lint to guard against missing break statements.</li>
</ol>
Comments welcome!

<a href="http://embeddedgurus.com/stack-overflow/2010/02/efficient-c-tip-11-avoid-passing-parameters-by-using-more-small-functions/">Previous Tip</a>

<a href="http://embeddedgurus.com/stack-overflow/">Home</a>]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/stack-overflow/2010/04/efficient-c-tip-12-be-wary-of-switch-statements/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Different Bit Types in Different Registers</title>
				<link>http://embeddedgurus.com/embedded-bridge/2010/03/different-bit-types-in-different-registers/</link>
				<comments>http://embeddedgurus.com/embedded-bridge/2010/03/different-bit-types-in-different-registers/#comments</comments>
				<pubDate>Tue, 30 Mar 2010 06:52:08 +0000</pubDate>
				<dc:creator>Gary Stringham</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/embedded-bridge/2010/03/different-bit-types-in-different-registers/</guid>
                <description><![CDATA[Several years ago I came across a register that, at first glance, seemed to be a typical register with several read/write bits. Upon closer inspection, I saw that one bit behaved differently, depending on its state and whether I was writing a 1 or a 0 to it. Since firmware had to handle that one bit differently from the other bits in that register, I saw that it would be difficult for firmware to safely handle both types in the same register. I was able to reduce the risk of mishandling that register by writing macros with generous comments. However, I could not ensure that everybody in the future would know that it was not a typical read/write register and treat it as such.

Although the engineer had valid reasons for designing the register that way, he or she had not anticipated the impact on firmware of mixing different types of bits in the same register. To avoid complexity and risk of firmware defects, different types of bits should be located in different registers. To see why, let’s examine how firmware manages two types of bits – read/write bits and interrupt bits.

With read/write bits, firmware sets and clears bits when needed. It typically first reads the register, modifies the desired bit, then writes the modified value back out. Here is a sample code fragment:
<blockquote>
<pre>tmp = RegRead (regA);  /* Get the register contents */
tmp |= 0x01;           /* Set bit 0 */
RegWrite (regA, tmp);  /* Write it back out */</pre>
</blockquote>
In the case of interrupt bits, firmware often writes a value with one bit set to acknowledge the desired interrupt while leaving any other pending interrupts untouched.
<blockquote>
<pre>RegWrite (regB, 0x10); /* Ack bit 4 */</pre>
</blockquote>
Mixing the two types of bits in the same register could cause problems. Using the read/write code on interrupt bits causes pending interrupts to inadvertently be acknowledged. Using the interrupt code on read/write bits clears all read/write bits that used to be set. Firmware must take special care to ensure that it does not inadvertently change the wrong bits.

Here is a code fragment that acknowledges bit 4 while taking care not to acknowledge other pending interrupts or modify any read/write bits. In this example, read/write bits are located in positions <code>0x0f</code> and interrupt bits are located in positions <code>0x70</code>.
<blockquote>
<pre>tmp = RegRead (regC);  /* Get the register contents */
tmp &amp;= 0x0f;           /* Keep R/W bits but zero any intr bits */
tmp |= 0x10;           /* Set bit 4 to ack */
RegWrite (regC, tmp);  /* Write it back out */</pre>
</blockquote>
Acknowledging an interrupt changed from a one-step to a four-step operation. A similar code fragment is needed to modify desired read/write bits while leaving alone any pending interrupts.

While there is a way for firmware to safely handle this, it is out of the ordinary and prone to firmware defects. Combining different types of bits into the same register may save registers but it adds unnecessary burden and complexity to firmware. Looking ahead and anticipating the firmware impact can lead to a more reliable and robust solution of placing different types of bits in separate registers.



<blockquote><strong>Best Practice:</strong> Segregate different types of bits (read/write, read-only, interrupt, etc.) into different registers.</blockquote>



If necessary, read-only bits could be combined with any one of the other types of bits. This is acceptable because no matter how the other bits are handled, firmware writes to the register will not affect the read-only bits.

Until the next bit...]]></description>
                <content:encoded><![CDATA[Several years ago I came across a register that, at first glance, seemed to be a typical register with several read/write bits. Upon closer inspection, I saw that one bit behaved differently, depending on its state and whether I was writing a 1 or a 0 to it. Since firmware had to handle that one bit differently from the other bits in that register, I saw that it would be difficult for firmware to safely handle both types in the same register. I was able to reduce the risk of mishandling that register by writing macros with generous comments. However, I could not ensure that everybody in the future would know that it was not a typical read/write register and treat it as such.

Although the engineer had valid reasons for designing the register that way, he or she had not anticipated the impact on firmware of mixing different types of bits in the same register. To avoid complexity and risk of firmware defects, different types of bits should be located in different registers. To see why, let’s examine how firmware manages two types of bits – read/write bits and interrupt bits.

With read/write bits, firmware sets and clears bits when needed. It typically first reads the register, modifies the desired bit, then writes the modified value back out. Here is a sample code fragment:
<blockquote>
<pre>tmp = RegRead (regA);  /* Get the register contents */
tmp |= 0x01;           /* Set bit 0 */
RegWrite (regA, tmp);  /* Write it back out */</pre>
</blockquote>
In the case of interrupt bits, firmware often writes a value with one bit set to acknowledge the desired interrupt while leaving any other pending interrupts untouched.
<blockquote>
<pre>RegWrite (regB, 0x10); /* Ack bit 4 */</pre>
</blockquote>
Mixing the two types of bits in the same register could cause problems. Using the read/write code on interrupt bits causes pending interrupts to inadvertently be acknowledged. Using the interrupt code on read/write bits clears all read/write bits that used to be set. Firmware must take special care to ensure that it does not inadvertently change the wrong bits.

Here is a code fragment that acknowledges bit 4 while taking care not to acknowledge other pending interrupts or modify any read/write bits. In this example, read/write bits are located in positions <code>0x0f</code> and interrupt bits are located in positions <code>0x70</code>.
<blockquote>
<pre>tmp = RegRead (regC);  /* Get the register contents */
tmp &amp;= 0x0f;           /* Keep R/W bits but zero any intr bits */
tmp |= 0x10;           /* Set bit 4 to ack */
RegWrite (regC, tmp);  /* Write it back out */</pre>
</blockquote>
Acknowledging an interrupt changed from a one-step to a four-step operation. A similar code fragment is needed to modify desired read/write bits while leaving alone any pending interrupts.
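
As a sketch of that similar fragment, the code below sets read/write bits in the mixed register without acknowledging anything. It cannot run against real hardware here, so <code>SimReg</code> and these versions of <code>RegRead</code> and <code>RegWrite</code> are a simulated stand-in, assuming that read/write bits take the written value and that writing a 1 to an interrupt bit acknowledges it. The masks match the positions used above; <code>set_rw_bits()</code> is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

#define RW_MASK   0x0fu   /* read/write bit positions */
#define INTR_MASK 0x70u   /* interrupt bit positions */

/* Simulated mixed register; a stand-in for real hardware (assumption). */
typedef struct {
    uint8_t rw;     /* current state of the read/write bits */
    uint8_t intr;   /* currently pending interrupt bits */
} SimReg;

static uint8_t RegRead(const SimReg *reg)
{
    return (uint8_t)(reg->rw | reg->intr);
}

static void RegWrite(SimReg *reg, uint8_t value)
{
    /* R/W bits simply take the written value. */
    reg->rw = (uint8_t)(value & RW_MASK);
    /* A written 1 acknowledges (clears) that pending interrupt. */
    reg->intr = (uint8_t)(reg->intr & ~(value & INTR_MASK));
}

/* Set the requested R/W bits without acknowledging any pending interrupt. */
void set_rw_bits(SimReg *reg, uint8_t bits)
{
    uint8_t tmp;

    assert((bits & ~RW_MASK) == 0u);  /* caller may only name R/W bits */

    tmp = RegRead(reg);               /* Get the register contents */
    tmp &= RW_MASK;                   /* Keep R/W bits but zero any intr bits */
    tmp |= bits;                      /* Set the desired R/W bits */
    RegWrite(reg, tmp);               /* Write it back out */
}
```

Without the masking step, the value read back would contain a 1 for every pending interrupt, and writing it out again would acknowledge all of them.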

While there is a way for firmware to safely handle this, it is out of the ordinary and prone to firmware defects. Combining different types of bits into the same register may save registers but it adds unnecessary burden and complexity to firmware. Looking ahead and anticipating the firmware impact can lead to a more reliable and robust solution of placing different types of bits in separate registers.



<blockquote><strong>Best Practice:</strong> Segregate different types of bits (read/write, read-only, interrupt, etc.) into different registers.</blockquote>



If necessary, read-only bits could be combined with any one of the other types of bits. This is acceptable because no matter how the other bits are handled, firmware writes to the register will not affect the read-only bits.

Until the next bit...]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/embedded-bridge/2010/03/different-bit-types-in-different-registers/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Hardware vs. firmware naming conventions</title>
				<link>http://embeddedgurus.com/stack-overflow/2010/03/hardware-vs-firmware-naming-conventions/</link>
				<comments>http://embeddedgurus.com/stack-overflow/2010/03/hardware-vs-firmware-naming-conventions/#comments</comments>
				<pubDate>Sun, 28 Mar 2010 16:01:32 +0000</pubDate>
				<dc:creator>Nigel Jones</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/stack-overflow/2010/03/hardware-vs-firmware-naming-conventions/</guid>
                <description><![CDATA[Today's post is motivated in part by Gary Stringham. Gary is the newest member of EmbeddedGurus and he consults and <a href="http://embeddedgurus.com/embedded-bridge/">blogs </a>on what he calls the bridge between hardware and firmware. Since I work on both hardware and firmware, I'm looking forward to what Gary has to say in the coming months. Anyway, I'd recently read his posting on <a href="http://embeddedgurus.com/embedded-bridge/2010/03/early-hardwarefirmware-collaboration/">Early hardware / firmware collaboration</a> when I found myself looking at a fairly complex schematic. The microprocessor had a lot of IO pins, most of which were being used. When I looked at the code to gain insight on how some of the IO was being used I found that the hardware engineer and firmware engineer had adopted completely different naming conventions. For example, what appeared on the schematic as "Relay 6" appeared in the code as "ALARM_RELAY_2". As a result the only way I could reconcile the schematic and the code was to look at a signal's port pin assignment on the schematic and then search the code to see what name was associated with that port pin. After I'd done this a few times, I realized I needed a more systematic approach and ended up going through all the port pin assignments in the code and using them to hand mark up the schematic. Clearly this was not only a colossal time waster, it also had the potential for introducing stupid bugs.

So how had this come about? Well, if you have ever designed hardware, you will know that naming nets is essentially optional. In other words, one can create a perfectly correct schematic without naming any of the nets. Instead, all you have to do is ensure that their connectivity is correct. (This is loosely analogous in firmware to referring to variables via their absolute addresses instead of assigning a name to the variable and using it. However, the consequences for the hardware design are nowhere near as dire.) Furthermore, if the engineer does decide to name a net, then in most schematic packages I've seen, one is free to use virtually any combination of characters. For example, "~LED A" would be a perfectly valid net name - but is most definitely not a valid C variable name. If one throws in the usual issue of numbering things from zero or one (should the first of four LEDs be named LED0 or LED1?), together with hardware engineers' frequent (and understandable) desire to indicate whether a signal is active low or active high by using some form of naming convention, then one has the recipe for a real mess.

So what's to be done? Well here are my suggestions:
<ol>
	<li>The hardware team should have a rigorously enforced naming convention (in much the same way that most companies have a coding standards manual).</li>
	<li>All nets that are used by firmware must be named on the schematic.</li>
	<li>The net names must be valid C identifiers, so that they can be used unchanged in the code.</li>
	<li>The firmware must use the identical name to that appearing on the schematic.</li>
</ol>
Clearly this can be facilitated by having very early meetings between the hardware and firmware teams, such that when the first version of the schematic is released, there is complete agreement on the net names. If you read Gary's blog post you'll see that this is his point too - albeit in a slightly different field.
<a href="http://www.embeddedgurus.com/stack-overflow/">Home</a>]]></description>
                <content:encoded><![CDATA[Today's post is motivated in part by Gary Stringham. Gary is the newest member of EmbeddedGurus and he consults and <a href="http://embeddedgurus.com/embedded-bridge/">blogs </a>on what he calls the bridge between hardware and firmware. Since I work on both hardware and firmware, I'm looking forward to what Gary has to say in the coming months. Anyway, I'd recently read his posting on <a href="http://embeddedgurus.com/embedded-bridge/2010/03/early-hardwarefirmware-collaboration/">Early hardware / firmware collaboration</a> when I found myself looking at a fairly complex schematic. The microprocessor had a lot of IO pins, most of which were being used. When I looked at the code to gain insight on how some of the IO was being used I found that the hardware engineer and firmware engineer had adopted completely different naming conventions. For example, what appeared on the schematic as "Relay 6" appeared in the code as "ALARM_RELAY_2". As a result the only way I could reconcile the schematic and the code was to look at a signal's port pin assignment on the schematic and then search the code to see what name was associated with that port pin. After I'd done this a few times, I realized I needed a more systematic approach and ended up going through all the port pin assignments in the code and using them to hand mark up the schematic. Clearly this was not only a colossal time waster, it also had the potential for introducing stupid bugs.

So how had this come about? Well, if you have ever designed hardware, you will know that naming nets is essentially optional. In other words, one can create a perfectly correct schematic without naming any of the nets. Instead, all you have to do is ensure that their connectivity is correct. (This is loosely analogous in firmware to referring to variables via their absolute addresses instead of assigning a name to the variable and using it. However, the consequences for the hardware design are nowhere near as dire.) Furthermore, if the engineer does decide to name a net, then in most schematic packages I've seen, one is free to use virtually any combination of characters. For example, "~LED A" would be a perfectly valid net name - but is most definitely not a valid C variable name. If one throws in the usual issue of numbering things from zero or one (should the first of four LEDs be named LED0 or LED1?), together with hardware engineers' frequent (and understandable) desire to indicate whether a signal is active low or active high by using some form of naming convention, then one has the recipe for a real mess.

So what's to be done? Well here are my suggestions:
<ol>
	<li>The hardware team should have a rigorously enforced naming convention (in much the same way that most companies have a coding standards manual).</li>
	<li>All nets that are used by firmware must be named on the schematic.</li>
	<li>The net names must be valid C identifiers, so that they can be used unchanged in the code.</li>
	<li>The firmware must use the identical name to that appearing on the schematic.</li>
</ol>
Clearly this can be facilitated by having very early meetings between the hardware and firmware teams, such that when the first version of the schematic is released, there is complete agreement on the net names. If you read Gary's blog post you'll see that this is his point too - albeit in a slightly different field.
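
To make the suggestions above concrete, here is a minimal sketch of how firmware might keep a single list of nets whose names match the schematic exactly. Everything in it is hypothetical: the net names, the port and pin assignments, the <code>_N</code> suffix for active-low signals, and the helper <code>net_find</code> are invented for illustration and come from no real design.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical net list: each entry is NET(name, port, pin), where name is
 * spelled exactly as on the schematic. All values here are invented. */
#define NET_LIST(NET)            \
    NET(ALARM_RELAY_1, 'B', 2)   \
    NET(ALARM_RELAY_2, 'B', 3)   \
    NET(STATUS_LED_0_N, 'C', 0)  /* assumed convention: _N means active low */

/* One enumerator per net, so code refers to a net by its schematic name. */
#define NET_AS_ENUM(name, port, pin) name,
typedef enum { NET_LIST(NET_AS_ENUM) NET_COUNT } net_id_t;

typedef struct {
    const char *name;  /* schematic net name as a string */
    char port;         /* port letter */
    uint8_t pin;       /* bit position within the port */
} net_info_t;

/* Table generated from the same list, so names cannot drift apart. */
#define NET_AS_INFO(name, port, pin) { #name, port, pin },
static const net_info_t net_table[NET_COUNT] = { NET_LIST(NET_AS_INFO) };

/* Look up a net by its schematic name; returns NET_COUNT if it is absent. */
static net_id_t net_find(const char *name)
{
    for (int i = 0; i < NET_COUNT; i++) {
        if (strcmp(net_table[i].name, name) == 0) {
            return (net_id_t)i;
        }
    }
    return NET_COUNT;
}
```

Because the names are also available as strings, a build step could compare them against a netlist exported from the schematic tool. A name such as "Relay 6" would simply not be found, which flags the mismatch early.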
<a href="http://www.embeddedgurus.com/stack-overflow/">Home</a>]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/stack-overflow/2010/03/hardware-vs-firmware-naming-conventions/feed/</wfw:commentRss>
			</item>
						<item>
				<title>How to Set the Size of your C Stack</title>
				<link>http://embeddedgurus.com/barr-code/2010/03/how-to-set-the-size-of-your-c-stack/</link>
				<comments>http://embeddedgurus.com/barr-code/2010/03/how-to-set-the-size-of-your-c-stack/#comments</comments>
				<pubDate>Wed, 24 Mar 2010 18:16:43 +0000</pubDate>
				<dc:creator>Michael Barr</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/barr-code/2010/03/how-to-set-the-size-of-your-c-stack/</guid>
                <description><![CDATA[<em>A reader of my monthly <a href="http://www.firmwareupdate.net">Firmware Update newsletter</a> recently sent an e-mail to ask:</em>

<blockquote>
I am a firmware engineer.  I read your <a href="/barr-code/2010/03/firmware-specific-bug-4-stack-overflow/">recent blog post regarding the C stack</a>, about which I have two questions: First, how can I increment or decrement the size of the stack in my code?  Second, what size should I choose?
</blockquote>

<em>Here's what I told him:</em>

The size of the stack is set either in the linker command file or in the C or C++ <a href="http://www.netrino.com/Embedded-Systems/Glossary-S#startup_code">startup code</a>.  You should be able to learn more about how to change the stack size from your specific compiler vendor's manual or customer support.

Identifying the minimum stack size required for your specific application is made challenging by these stubborn facts:

- MEASURING the maximum stack growth during testing may not be sufficient.  If you test for half a year, the product is sure to be run for a year or longer in the field.  Have you really tested all possible cases?  What about all possible series of interrupt service routines on top of that worst case use by main()?

- TOP DOWN ANALYSIS of the compiled code can be done to determine the number of function calls and interrupt service routines at maximum depth; their individual parameter and local variable use, etc.  Unfortunately, these things may keep changing whenever you change the code and recompile.

The best approach is usually to perform a conservative top down analysis of the source code; when in doubt, always round up.  Don't forget about nested interrupt service routines.  Double that conservative estimate to set your initial stack budget.  Then measure actual stack utilization during testing, preferably with code coverage analysis tools running--to ensure that you've tested all possible paths (except interrupts, which may run at different times in the field).

Then if you need to reclaim memory to ship the product, start shrinking the stack.  But also put into place a high water mark system (e.g., 0xDEADBEEF) complete with supervisor code to put the product into a failsafe state if more than, for example, 80% of the stack is ever used.]]></description>
                <content:encoded><![CDATA[<em>A reader of my monthly <a href="http://www.firmwareupdate.net">Firmware Update newsletter</a> recently sent an e-mail to ask:</em>

<blockquote>
I am a firmware engineer.  I read your <a href="/barr-code/2010/03/firmware-specific-bug-4-stack-overflow/">recent blog post regarding the C stack</a>, about which I have two questions: First, how can I increment or decrement the size of the stack in my code?  Second, what size should I choose?
</blockquote>

<em>Here's what I told him:</em>

The size of the stack is set either in the linker command file or in the C or C++ <a href="http://www.netrino.com/Embedded-Systems/Glossary-S#startup_code">startup code</a>.  You should be able to learn more about how to change the stack size from your specific compiler vendor's manual or customer support.

Identifying the minimum stack size required for your specific application is made challenging by these stubborn facts:

- MEASURING the maximum stack growth during testing may not be sufficient.  If you test for half a year, the product is sure to be run for a year or longer in the field.  Have you really tested all possible cases?  What about all possible series of interrupt service routines on top of that worst case use by main()?

- TOP DOWN ANALYSIS of the compiled code can be done to determine the number of function calls and interrupt service routines at maximum depth; their individual parameter and local variable use, etc.  Unfortunately, these things may keep changing whenever you change the code and recompile.

The best approach is usually to perform a conservative top down analysis of the source code; when in doubt, always round up.  Don't forget about nested interrupt service routines.  Double that conservative estimate to set your initial stack budget.  Then measure actual stack utilization during testing, preferably with code coverage analysis tools running--to ensure that you've tested all possible paths (except interrupts, which may run at different times in the field).
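
One common way to measure stack utilization is to fill the stack region with a known pattern at startup and later count how many words still hold it. The following is a minimal sketch of that idea, not a drop-in implementation. It assumes a stack that grows downward toward the lowest address of the region, and it takes the region as an ordinary array so that it can be tried on a host. The function names and the fill value are illustrative choices.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xDEADBEEFu  /* recognizable fill pattern */

/* Fill the whole stack region with the pattern. On a target this would be
 * done once at startup, before the region is in use. */
static void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/* Count the words that have been used. Assumes the stack grows downward
 * from base[words - 1] toward base[0], so untouched words sit at the
 * low end of the region. */
static size_t stack_words_used(const uint32_t *base, size_t words)
{
    size_t untouched = 0;
    while (untouched < words && base[untouched] == STACK_FILL) {
        untouched++;
    }
    return words - untouched;
}

/* Nonzero if more than the given percentage of the stack has been used. */
static int stack_over_budget(const uint32_t *base, size_t words,
                             unsigned percent)
{
    return stack_words_used(base, words) * 100u > words * percent;
}
```

A supervisor task could call <code>stack_over_budget()</code> periodically and act on the result. Note that the count is a lower bound: a stack word that happens to be written with the fill value itself is indistinguishable from an untouched one.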

Then if you need to reclaim memory to ship the product, start shrinking the stack.  But also put into place a high water mark system (e.g., 0xDEADBEEF) complete with supervisor code to put the product into a failsafe state if more than, for example, 80% of the stack is ever used.]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/barr-code/2010/03/how-to-set-the-size-of-your-c-stack/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Toyota's Embedded Software Image Problem</title>
				<link>http://embeddedgurus.com/barr-code/2010/03/toyotas-embedded-software-image-problem/</link>
				<comments>http://embeddedgurus.com/barr-code/2010/03/toyotas-embedded-software-image-problem/#comments</comments>
				<pubDate>Fri, 19 Mar 2010 21:02:17 +0000</pubDate>
				<dc:creator>Michael Barr</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/barr-code/2010/03/toyotas-embedded-software-image-problem/</guid>
                <description><![CDATA[It remains unclear whether Toyota's <a href="http://www.washingtonpost.com/wp-dyn/content/graphic/2010/03/10/GR2010031004046.html">higher-than-industry-average number of complaints</a> regarding <a href="http://en.wikipedia.org/wiki/Sudden_unintended_acceleration">sudden unintended acceleration</a> (SUA) is caused (in whole or in part) by an embedded software problem.  But whether it is or it isn't actually firmware, the company has clearly denied it and yet still developed an embedded software "image problem".  They've brought some of this on themselves.

<em>Side Note</em>: I think it is a net positive that journalists, the mass media, and a broader swath of the general public are increasingly aware that there is software embedded inside cars, airplanes, medical devices, and just about everything else with a power supply or batteries.  Firmware has been inside these products for many years, of course.  But as I wrote in <a href="http://electronicdesign.com/article/embedded-software/faulty_code_will_lead_to_an_era_of_firmware_related_litigation.aspx">a recent article in Electronic Design</a>, my experience working with companies across many industries leads me to believe there is <a href="/barr-code/2010/02/embedded-software-is-the-future-of-product-quality-and-safety/">a looming firmware quality crisis</a>.  Greater public awareness is sure to bring <a href="http://embedded.com/columns/barrcode/221901488">litigation</a>.  This will force engineering management to care more about firmware quality than they currently do.

<strong>Toyota's Firmware Image Problem</strong>

Long before the "floor-mat recall" <a href="http://www.nhtsa.dot.gov">NHTSA</a> had logged a higher number of unintended acceleration complaints (4.51 complaints per 100,000 cars sold for the 2005 to 2010 model years) for Toyota than any other company.  (A recent Washington Post <a href="http://www.washingtonpost.com/wp-dyn/content/graphic/2010/03/10/GR2010031004046.html">graphic</a> has more data.)  Apparently, NHTSA and Toyota were investigating the reports--but hadn't yet taken any action.

It seems that what set that first Toyota recall in motion was a high-profile <a href="http://www.washingtonpost.com/wp-dyn/content/article/2010/01/28/AR2010012803971_pf.html">fatal August 2009 crash involving an off-duty California Highway Patrol officer</a>, his family, a runaway Lexus, and a <a href="http://www.entertonement.com/clips/fmnjpnzgmb--Chris-Lastrella-911-Call-Before-Crashing-911-Calls-Chris-Lastrella">disturbing 911 call</a>.  Given the context of that specific crash, I'm not convinced the floor mat recall made much sense.  In particular, I find it hard to believe that a police officer with adrenaline pumping through his veins and his family's life on the line wouldn't just rip a stuck floor mat out of the way like the Incredible Hulk. (Or that he would choose running off the road at 125 mph vs. shutting the vehicle off entirely.)  But I don't have all the facts about either that specific accident or the reasoning behind the floor mat recall.

The broader recalls that have happened since have focused on also adding mechanical strength to the accelerator pedals in a number of different makes and models.  To this day, Toyota categorically denies any sort of electrical problem.  Yet some cars that have been modified in this way have since been reported to experience unintended acceleration!  Besides which, mechanical parts generally fail visibly or entirely once they first fail--rather than intermittently.  Intermittent failures are far more common with electronics (think EMI) and firmware.

Toyota's firmware image problem stems from two things:  First, they have separately recalled the Prius for a braking-related firmware upgrade.  Other possible <a target="_blank" href="http://www.toyota.com/prius-hybrid/">Prius</a> software issues have been identified by <a target="_blank" href="http://www.youtube.com/watch?v=hc2_yLXy9O4">Steve Wozniak</a> and <a target="_blank" href="http://www.youtube.com/watch?v=Rr6dm0qFRTw">Jim Sikes</a>, but these have not yet been confirmed.  Second, the continued reliance (by Toyota and NHTSA) on arguments such as "we can't reproduce the problem and we haven't been able to see it during testing" as proof that there's not a software bug is simply unbelievable.

Anyone who works with software knows from experience that lots of bugs can't be easily reproduced.  The fact that these incidents can't be reproduced is not proof of anything.

<strong>Software in Cars: The Future</strong>

Don't get me wrong.  I want more software in my car not less.  I very much look forward to the day that an in-car computer takes over the driving for me.  After all, some cars already have more sensor data to make decisions on than the driver does.  Imagine what a car with an integrated GPS navigation system, auto-follow cruise control, and collision avoidance systems could do.  While I guess that I should move left one lane to avoid a crash, the computer is capable of seeing in all directions at once, calculating all of the trajectories of near-by cars, including instantaneous changes in their acceleration or deceleration.

Additionally, I suspect that even with bugs in a car's drive-by-wire software the car may be much safer overall for its electronic traction control and anti-lock braking systems.

I just wish that Toyota would own up to the fact that the inability to reproduce a problem doesn't rule out a software (or EMI) flaw.]]></description>
                <content:encoded><![CDATA[It remains unclear whether Toyota's <a href="http://www.washingtonpost.com/wp-dyn/content/graphic/2010/03/10/GR2010031004046.html">higher-than-industry-average number of complaints</a> regarding <a href="http://en.wikipedia.org/wiki/Sudden_unintended_acceleration">sudden unintended acceleration</a> (SUA) is caused (in whole or in part) by an embedded software problem.  But whether it is or it isn't actually firmware, the company has clearly denied it and yet still developed an embedded software "image problem".  They've brought some of this on themselves.

<em>Side Note</em>: I think it is a net positive that journalists, the mass media, and a broader swath of the general public are increasingly aware that there is software embedded inside cars, airplanes, medical devices, and just about everything else with a power supply or batteries.  Firmware has been inside these products for many years, of course.  But as I wrote in <a href="http://electronicdesign.com/article/embedded-software/faulty_code_will_lead_to_an_era_of_firmware_related_litigation.aspx">a recent article in Electronic Design</a>, my experience working with companies across many industries leads me to believe there is <a href="/barr-code/2010/02/embedded-software-is-the-future-of-product-quality-and-safety/">a looming firmware quality crisis</a>.  Greater public awareness is sure to bring <a href="http://embedded.com/columns/barrcode/221901488">litigation</a>.  This will force engineering management to care more about firmware quality than they currently do.

<strong>Toyota's Firmware Image Problem</strong>

Long before the "floor-mat recall" <a href="http://www.nhtsa.dot.gov">NHTSA</a> had logged a higher number of unintended acceleration complaints (4.51 complaints per 100,000 cars sold for the 2005 to 2010 model years) for Toyota than any other company.  (A recent Washington Post <a href="http://www.washingtonpost.com/wp-dyn/content/graphic/2010/03/10/GR2010031004046.html">graphic</a> has more data.)  Apparently, NHTSA and Toyota were investigating the reports--but hadn't yet taken any action.

It seems that what set that first Toyota recall in motion was a high-profile <a href="http://www.washingtonpost.com/wp-dyn/content/article/2010/01/28/AR2010012803971_pf.html">fatal August 2009 crash involving an off-duty California Highway Patrol officer</a>, his family, a runaway Lexus, and a <a href="http://www.entertonement.com/clips/fmnjpnzgmb--Chris-Lastrella-911-Call-Before-Crashing-911-Calls-Chris-Lastrella">disturbing 911 call</a>.  Given the context of that specific crash, I'm not convinced the floor mat recall made much sense.  In particular, I find it hard to believe that a police officer with adrenaline pumping through his veins and his family's life on the line wouldn't just rip a stuck floor mat out of the way like the Incredible Hulk. (Or that he would choose running off the road at 125 mph vs. shutting the vehicle off entirely.)  But I don't have all the facts about either that specific accident or the reasoning behind the floor mat recall.

The broader recalls that have happened since have focused on also adding mechanical strength to the accelerator pedals in a number of different makes and models.  To this day, Toyota categorically denies any sort of electrical problem.  Yet some cars that have been modified in this way have since been reported to experience unintended acceleration!  Besides which, mechanical parts generally fail visibly or entirely once they first fail--rather than intermittently.  Intermittent failures are far more common with electronics (think EMI) and firmware.

Toyota's firmware image problem stems from two things:  First, they have separately recalled the Prius for a braking-related firmware upgrade.  Other possible <a target="_blank" href="http://www.toyota.com/prius-hybrid/">Prius</a> software issues have been identified by <a target="_blank" href="http://www.youtube.com/watch?v=hc2_yLXy9O4">Steve Wozniak</a> and <a target="_blank" href="http://www.youtube.com/watch?v=Rr6dm0qFRTw">Jim Sikes</a>, but these have not yet been confirmed.  Second, the continued reliance (by Toyota and NHTSA) on arguments such as "we can't reproduce the problem and we haven't been able to see it during testing" as proof that there's not a software bug is simply unbelievable.

Anyone who works with software knows from experience that lots of bugs can't be easily reproduced.  The fact that these incidents can't be reproduced is not proof of anything.

<strong>Software in Cars: The Future</strong>

Don't get me wrong.  I want more software in my car not less.  I very much look forward to the day that an in-car computer takes over the driving for me.  After all, some cars already have more sensor data to make decisions on than the driver does.  Imagine what a car with an integrated GPS navigation system, auto-follow cruise control, and collision avoidance systems could do.  While I guess that I should move left one lane to avoid a crash, the computer is capable of seeing in all directions at once, calculating all of the trajectories of near-by cars, including instantaneous changes in their acceleration or deceleration.

Additionally, I suspect that even with bugs in a car's drive-by-wire software the car may be much safer overall for its electronic traction control and anti-lock braking systems.

I just wish that Toyota would own up to the fact that the inability to reproduce a problem doesn't rule out a software (or EMI) flaw.]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/barr-code/2010/03/toyotas-embedded-software-image-problem/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Reading a register for its side effects in C and C++</title>
				<link>http://embeddedgurus.com/stack-overflow/2010/03/reading-a-register-for-its-side-effects-in-c-and-c/</link>
				<comments>http://embeddedgurus.com/stack-overflow/2010/03/reading-a-register-for-its-side-effects-in-c-and-c/#comments</comments>
				<pubDate>Mon, 15 Mar 2010 19:23:28 +0000</pubDate>
				<dc:creator>Nigel Jones</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/stack-overflow/2010/03/reading-a-register-for-its-side-effects-in-c-and-c/</guid>
                <description><![CDATA[<span style="font-family: Times New Roman;font-size: small">Although  today's post is the first real post on the new EmbeddedGurus, it's special for  another reason. This post is being jointly written with John Regehr. John is an  Associate Professor of Computer Science at the University of Utah and maintains an excellent blog, <a href="http://blog.regehr.org/" target="_blank">Embedded in Academia </a>which I  heartily recommend. This blog posting grew out of a lengthy email exchange which  started with John alerting me to some blatant plagiarism of my work and then  evolved (dissolved?) into what you find here. John is also posting this article  on his blog.</span>

<span style="font-family: Times New Roman;font-size: small">Anyway,  enough preamble, on to the topic at hand.</span>

Once in a while one finds oneself having to read a device register, but without needing or caring what the value of the register is. A typical scenario is as follows. You have written some sort of asynchronous communications driver. The driver is set up to generate an interrupt upon receipt of a character. In the ISR, the code first examines a status register to see if the character has been received correctly (e.g. no framing, parity or overrun errors). If an error has occurred, what should the code do? Well, in just about every system we have worked on, it is necessary to read the register that contains the received character -- even though the character is useless. If you don't perform the read, then you will almost certainly get an overrun error on the next character. Thus you find yourself in the position of having to read a register even though its value is useless. The question then becomes, how does one do this in C? In the following examples, assume that SBUF is the register holding the data to be discarded and that SBUF is understood to be <em>volatile</em>. The exact semantics of the declaration of SBUF vary from compiler to compiler.

If you are programming in C and if your compiler correctly supports the volatile qualifier, then this simple code suffices:
<pre>void cload_reg1 (void)
{
   SBUF;
}</pre>
This certainly looks a little strange, but it is completely legal C and should generate the requisite read, and nothing more.  For example, at the -Os optimization level, the MSP430 port of GCC gives this code:
<pre>cload_reg1:
    mov &amp;SBUF, r15
    ret</pre>
Unfortunately, there are two practical problems with this C code. First, quite a few C compilers incorrectly translate this code, although the C standard gives it an unambiguous meaning.  We tested the code on a variety of general-purpose and embedded compilers, and present the results below. These results are a little depressing.

The second problem is even scarier.  The problem is that the C++ standard is not 100% clear about what the code above means.  On one hand, the standard says this:
<blockquote>In general, the semantics of volatile are intended to be the same in C++ as they are in C.</blockquote>
A number of C++ compilers, including GCC and LLVM, generate the same code for cload_reg1() when compiling in C++ mode as they do in C mode.  On the other hand, several high-quality C++ compilers, such as those from ARM, Intel, and IAR, turn the function cload_reg1() into object code that does nothing.  We discussed this issue with people from the compiler groups at Intel and IAR, and both gave essentially the same response.  Here we quote (with permission) from the Intel folks:
<blockquote>The operation that turns into a load instruction in the executable code is what the C++ standard calls the lvalue-to-rvalue conversion; it converts an lvalue (which identifies an object, which resides in memory and has an address) into an rvalue (or just value; something whose address can't be taken and can be in a register). The C++ standard is very clear and explicit about where the lvalue-to-rvalue conversion happens. Basically, it happens for most operands of most operators - but of course not for the left operand of assignment, or the operand of unary ampersand, for example. The top-level expression of an expression statement, which is of course not the operand of any operator, is not a context where the lvalue-to-rvalue conversion happens.

In the C standard, the situation is somewhat different. The C standard has a list of the contexts where the lvalue-to-rvalue conversion doesn't happen, and that list doesn't include appearing as the expression in an expression-statement.

So we're doing exactly what the various standards say to do. It's not a matter of the C++ standard allowing the volatile reference to be optimized away; in C++, the standard requires that it not happen in the first place.</blockquote>
We think the last sentence sums it up beautifully. How many readers were aware that the semantics for the volatile qualifier are significantly different between C and C++? The additional implication is that, as shown below, GCC, the Microsoft compiler, and Open64 are in error when compiling C++ code.

We asked about this on the GCC mailing list and received only one response, which was basically "Why should we change the semantics, since this will break working code?"  This is a fair point.  Frankly speaking, the semantics of volatile in C are a bit of a mess and C++ makes the situation much worse by permitting reasonable people to interpret it in two totally different ways.
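
Given this ambiguity, one conservative pattern is to make the discarded read the right operand of an assignment, which, per the explanation quoted above, is a context where the lvalue-to-rvalue conversion does happen. The following is a minimal sketch of the receive-error scenario written that way. It is only an illustration: the registers are simulated as ordinary volatile variables so that it compiles on a host, and <code>SSTAT</code>, <code>RX_ERROR_MASK</code>, <code>rx_last</code>, <code>rx_errors</code> and <code>rx_isr</code> are hypothetical names. On a real target, SBUF would come from the compiler's own register definitions.

```c
#include <stdint.h>

/* Simulated registers (assumption: stand-ins for real device registers). */
volatile uint8_t SSTAT;  /* hypothetical receive status register */
volatile uint8_t SBUF;   /* receive data register, as in the text */

#define RX_ERROR_MASK 0x07u  /* hypothetical framing/parity/overrun bits */

static uint8_t  rx_last;    /* most recent correctly received character */
static unsigned rx_errors;  /* number of characters discarded */

void rx_isr(void)
{
    if (SSTAT & RX_ERROR_MASK) {
        /* Discard the character. The read of SBUF is the right operand
         * of an assignment rather than a bare expression statement. */
        volatile uint8_t sink;
        sink = SBUF;
        rx_errors++;
    } else {
        rx_last = SBUF;  /* normal case: keep the character */
    }
}
```

The cost of this form is an extra store to a local variable. Whether that matters depends on the target and on how often errors occur.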
<h2>Experimental Results</h2>
To test C and C++ compilers, we compiled the following two functions to object code at a reasonably high level of optimization:
<pre>extern volatile unsigned char foo;</pre>
<pre>void cload_reg1 (void)
{
   foo;
}</pre>
<pre>void cload_reg2 (void)
{
   volatile unsigned char sink;
   sink = foo;
}</pre>
For embedded compilers that have built-in support for accessing hardware registers, we tested two additional functions where, as above, SBUF is understood to be a hardware register defined by the semantics of the compiler under test:
<pre>void cload_reg3 (void)
{
   SBUF;
}

void cload_reg4 (void)
{
   volatile unsigned char sink;
   sink = SBUF;
}</pre>
The results were as follows.
<h3>GCC</h3>
We tested version 4.4.1, hosted on x86 Linux and also targeting x86 Linux, using optimization level -Os.  The C compiler loads from foo in both cload_reg1() and cload_reg2().  No warnings are generated.  The C++ compiler shows the same behavior as the C compiler.
<h3>Intel Compiler</h3>
We tested icc version 11.1, hosted on x86 Linux and also targeting x86 Linux, using optimization level -Os.  The C compiler emits code loading from foo for both cload_reg1() and cload_reg2(), without giving any warnings. The C++ compiler emits a warning "expression has no effect" for cload_reg1() and this function does not load from foo.  cload_reg2() does load from foo and gives no warnings.
<h3>Sun Compiler</h3>
We tested suncc version 5.10, hosted on x86 Linux and also targeting x86 Linux, using optimization level -O.  The C compiler does not load from foo in cload_reg1(), nor does it emit any warning.  It does load from foo in cload_reg2().  The C++ compiler has the same behavior as the C compiler.
<h3>x86-Open64</h3>
We tested opencc version 4.2.3, hosted on x86 Linux and also targeting x86 Linux, using optimization level -Os.  The C compiler does not load from foo in cload_reg1(), nor does it emit any warning.  It does load from foo in cload_reg2().  The C++ compiler has the same behavior as the C compiler.
<h3>LLVM / Clang</h3>
We tested subversion rev 98508, which is between versions 2.6 and 2.7, hosted on x86 Linux and also targeting x86 Linux, using optimization level -Os.  The C compiler loads from foo in both cload_reg1() and cload_reg2().  A warning about unused value is generated for cload_reg1().  The C++ compiler shows the same behavior as the C compiler.
<h3>CrossWorks for MSP430</h3>
We tested version 2.0.8.2009062500.4974, hosted on x86 Linux, using optimization level -O.  This compiler supports only C.  foo was not loaded in cload_reg1(), but it was loaded in cload_reg2().
<h3>IAR for AVR</h3>
We tested version 5.30.6.50191, hosted on Windows XP, using maximum speed optimization.  The C compiler performed the load in all four cases. The C++ compiler did not perform the load for cload_reg1() or cload_reg3(), but did for cload_reg2() and cload_reg4().
<h3>Keil 8051</h3>
We tested version 8.01, hosted on Windows XP, using optimization level 8, configured to favor speed. The Keil compiler failed to generate the required load in cload_reg1() (but did at least give a warning), yet did perform the load in all other cases, including cload_reg3(). This suggests that the Keil compiler treats its IO register (SFR) semantics differently from volatile variable semantics.
<h3>HI-TECH for PIC16</h3>
We tested version 9.70, hosted on Windows XP, using Global optimization level 9, configured to favor speed. This was very interesting in that the results were almost a mirror image of the Keil compiler's. In this case the load was performed in all cases except cload_reg3(). Thus the HI-TECH semantics for IO registers and volatile variables also appear to be different - just the opposite of Keil!  No warning was generated by the HI-TECH compiler when it failed to generate code.
<h3>Microchip Compiler for PIC18</h3>
We tested version 3.35, hosted on Windows XP, using full optimization level. This rounded out the group of embedded compilers quite nicely in that it didn't perform the load in either cload_reg1() or cload_reg3() - but did in the rest. It also failed to warn about the statements having no effect. This was the worst performing of all the compilers we tested.
<h2>Summary</h2>
The level of non-conformance among the C compilers, together with the genuine uncertainty as to what the C++ compilers should do, presents a real quandary. If you need the most efficient code possible, then you have no option other than to investigate what your compiler does. If you are looking for a generally reliable and portable solution, then the methodology in cload_reg2() is probably your best bet. However, it would be just that: a bet. Naturally, we (and the other readers of this blog) would be very interested to hear what your compiler does. So if you have a few minutes, please run the sample code through your compiler and let us know the results.
<h2>Acknowledgments</h2>
We'd like to thank Hans Boehm at HP, Arch Robison at Intel, and the compiler groups at both Intel and IAR for their valuable feedback that helped us construct this post.  Any mistakes are, of course, ours.
<a href="http://www.embeddedgurus.com/stack-overflow/">Home</a>]]></description>
                <content:encoded><![CDATA[<span style="font-family: Times New Roman;font-size: small">Although  today's post is the first real post on the new EmbeddedGurus, it's special for  another reason. This post is being jointly written with John Regehr. John is an  Associate Professor of Computer Science at the University of Utah and maintains an excellent blog, <a href="http://blog.regehr.org/" target="_blank">Embedded in Academia </a>which I  heartily recommend. This blog posting grew out of a lengthy email exchange which  started with John alerting me to some blatant plagiarism of my work and then  evolved (dissolved?) into what you find here. John is also posting this article  on his blog.</span>

<span style="font-family: Times New Roman;font-size: small">Anyway,  enough preamble, on to the topic at hand.</span>

Once in a while one finds oneself having to read a device register without needing or caring what the value of the register is. A typical scenario is as follows. You have written some sort of asynchronous communications driver. The driver is set up to generate an interrupt upon receipt of a character. In the ISR, the code first examines a status register to see if the character has been received correctly (e.g. no framing, parity or overrun errors). If an error has occurred, what should the code do? Well, in just about every system we have worked on, it is necessary to read the register that contains the received character, even though the character is useless. If you don't perform the read, then you will almost certainly get an overrun error on the next character. Thus you find yourself in the position of having to read a register even though its value is useless. The question then becomes, how does one do this in C? In the following examples, assume that SBUF is the register holding the data to be discarded and that SBUF is understood to be <em>volatile</em>. The exact semantics of the declaration of SBUF vary from compiler to compiler.

If you are programming in C and if your compiler correctly supports the volatile qualifier, then this simple code suffices:
<pre>void cload_reg1 (void)
{
   SBUF;
}</pre>
This certainly looks a little strange, but it is completely legal C and should generate the requisite read, and nothing more.  For example, at the -Os optimization level, the MSP430 port of GCC gives this code:
<pre>cload_reg1:
    mov &amp;SBUF, r15
    ret</pre>
Unfortunately, there are two practical problems with this C code. First, quite a few C compilers incorrectly translate this code, although the C standard gives it an unambiguous meaning.  We tested the code on a variety of general-purpose and embedded compilers, and present the results below. These results are a little depressing.

The second problem is even scarier.  The problem is that the C++ standard is not 100% clear about what the code above means.  On one hand, the standard says this:
<blockquote>In general, the semantics of volatile are intended to be the same in C++ as they are in C.</blockquote>
A number of C++ compilers, including GCC and LLVM, generate the same code for cload_reg1() when compiling in C++ mode as they do in C mode.  On the other hand, several high-quality C++ compilers, such as those from ARM, Intel, and IAR, turn the function cload_reg1() into object code that does nothing.  We discussed this issue with people from the compiler groups at Intel and IAR, and both gave essentially the same response.  Here we quote (with permission) from the Intel folks:
<blockquote>The operation that turns into a load instruction in the executable code is what the C++ standard calls the lvalue-to-rvalue conversion; it converts an lvalue (which identifies an object, which resides in memory and has an address) into an rvalue (or just value; something whose address can't be taken and can be in a register). The C++ standard is very clear and explicit about where the lvalue-to-rvalue conversion happens. Basically, it happens for most operands of most operators - but of course not for the left operand of assignment, or the operand of unary ampersand, for example. The top-level expression of an expression statement, which is of course not the operand of any operator, is not a context where the lvalue-to-rvalue conversion happens.

In the C standard, the situation is somewhat different. The C standard has a list of the contexts where the lvalue-to-rvalue conversion doesn't happen, and that list doesn't include appearing as the expression in an expression-statement.

So we're doing exactly what the various standards say to do. It's not a matter of the C++ standard allowing the volatile reference to be optimized away; in C++, the standard requires that it not happen in the first place.</blockquote>
We think the last sentence sums it up beautifully. How many readers were aware that the semantics for the volatile qualifier are significantly different between C and C++? The additional implication is that, as shown below, GCC, the Microsoft compiler, and Open64, when compiling C++ code, are in error.

We asked about this on the GCC mailing list and received only one response, which was basically "Why should we change the semantics, since this will break working code?"  This is a fair point.  Frankly speaking, the semantics of volatile in C are a bit of a mess, and C++ makes the situation much worse by permitting reasonable people to interpret it in two totally different ways.
<h2>Experimental Results</h2>
To test C and C++ compilers, we compiled the following two functions to object code at a reasonably high level of optimization:
<pre>extern volatile unsigned char foo;</pre>
<pre>void cload_reg1 (void)
{
   foo;
}</pre>
<pre>void cload_reg2 (void)
{
   volatile unsigned char sink;
   sink = foo;
}</pre>
For embedded compilers that have built-in support for accessing hardware registers, we tested two additional functions where, as above, SBUF is understood to be a hardware register defined by the semantics of the compiler under test:
<pre>void cload_reg3 (void)
{
   SBUF;
}

void cload_reg4 (void)
{
   volatile unsigned char sink;
   sink = SBUF;
}</pre>
The results were as follows.
<h3>GCC</h3>
We tested version 4.4.1, hosted on x86 Linux and also targeting x86 Linux, using optimization level -Os.  The C compiler loads from foo in both cload_reg1() and cload_reg2().  No warnings are generated.  The C++ compiler shows the same behavior as the C compiler.
<h3>Intel Compiler</h3>
We tested icc version 11.1, hosted on x86 Linux and also targeting x86 Linux, using optimization level -Os.  The C compiler emits code loading from foo for both cload_reg1() and cload_reg2(), without giving any warnings. The C++ compiler emits a warning "expression has no effect" for cload_reg1() and this function does not load from foo.  cload_reg2() does load from foo and gives no warnings.
<h3>Sun Compiler</h3>
We tested suncc version 5.10, hosted on x86 Linux and also targeting x86 Linux, using optimization level -O.  The C compiler does not load from foo in cload_reg1(), nor does it emit any warning.  It does load from foo in cload_reg2().  The C++ compiler has the same behavior as the C compiler.
<h3>x86-Open64</h3>
We tested opencc version 4.2.3, hosted on x86 Linux and also targeting x86 Linux, using optimization level -Os.  The C compiler does not load from foo in cload_reg1(), nor does it emit any warning.  It does load from foo in cload_reg2().  The C++ compiler has the same behavior as the C compiler.
<h3>LLVM / Clang</h3>
We tested subversion rev 98508, which is between versions 2.6 and 2.7, hosted on x86 Linux and also targeting x86 Linux, using optimization level -Os.  The C compiler loads from foo in both cload_reg1() and cload_reg2().  A warning about unused value is generated for cload_reg1().  The C++ compiler shows the same behavior as the C compiler.
<h3>CrossWorks for MSP430</h3>
We tested version 2.0.8.2009062500.4974, hosted on x86 Linux, using optimization level -O.  This compiler supports only C.  foo was not loaded in cload_reg1(), but it was loaded in cload_reg2().
<h3>IAR for AVR</h3>
We tested version 5.30.6.50191, hosted on Windows XP, using maximum speed optimization.  The C compiler performed the load in all four cases. The C++ compiler did not perform the load for cload_reg1() or cload_reg3(), but did for cload_reg2() and cload_reg4().
<h3>Keil 8051</h3>
We tested version 8.01, hosted on Windows XP, using optimization level 8, configured to favor speed. The Keil compiler failed to generate the required load in cload_reg1() (but did at least give a warning), yet did perform the load in all other cases, including cload_reg3(). This suggests that the Keil compiler treats its IO register (SFR) semantics differently from volatile variable semantics.
<h3>HI-TECH for PIC16</h3>
We tested version 9.70, hosted on Windows XP, using Global optimization level 9, configured to favor speed. This was very interesting in that the results were almost a mirror image of the Keil compiler's. In this case the load was performed in all cases except cload_reg3(). Thus the HI-TECH semantics for IO registers and volatile variables also appear to be different - just the opposite of Keil!  No warning was generated by the HI-TECH compiler when it failed to generate code.
<h3>Microchip Compiler for PIC18</h3>
We tested version 3.35, hosted on Windows XP, using full optimization level. This rounded out the group of embedded compilers quite nicely in that it didn't perform the load in either cload_reg1() or cload_reg3() - but did in the rest. It also failed to warn about the statements having no effect. This was the worst performing of all the compilers we tested.
<h2>Summary</h2>
The level of non-conformance among the C compilers, together with the genuine uncertainty as to what the C++ compilers should do, presents a real quandary. If you need the most efficient code possible, then you have no option other than to investigate what your compiler does. If you are looking for a generally reliable and portable solution, then the methodology in cload_reg2() is probably your best bet. However, it would be just that: a bet. Naturally, we (and the other readers of this blog) would be very interested to hear what your compiler does. So if you have a few minutes, please run the sample code through your compiler and let us know the results.
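To make the cload_reg2() approach concrete, here is a minimal sketch of how it might sit inside the receive handler described at the start of this post. Everything named here is an assumption made for illustration: fake_status and fake_sbuf are ordinary volatile variables standing in for real registers so the code can run on a host, and RX_ERROR_MASK, discard_read() and uart_rx_handler() are hypothetical names, not part of any vendor's headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bit: set when the last character arrived with a
   framing, parity or overrun error. Real hardware defines its own layout. */
#define RX_ERROR_MASK 0x01u

/* Stand-ins for memory-mapped registers so the sketch runs on a host. */
volatile uint8_t fake_status;
volatile uint8_t fake_sbuf;

static uint8_t  rx_buffer[16];
static unsigned rx_count;

/* Read a register purely for its side effect, in the style of cload_reg2().
   Every compiler tested in this post performed the load for this form. */
static void discard_read(const volatile uint8_t *reg)
{
    volatile uint8_t sink;

    assert(reg != NULL);
    sink = *reg;   /* assignment to a volatile object forces the access */
}

/* Sketch of a receive handler: keep good characters, discard bad ones,
   but read the data register in both cases. */
void uart_rx_handler(void)
{
    uint8_t status = fake_status;

    if ((status & RX_ERROR_MASK) != 0u || rx_count >= sizeof rx_buffer) {
        discard_read(&fake_sbuf);          /* value is useless; read anyway */
    } else {
        rx_buffer[rx_count++] = fake_sbuf; /* normal path keeps the value */
    }
}

/* Number of characters accepted so far. */
unsigned uart_rx_count(void)
{
    return rx_count;
}
```

On compilers whose special function register types do not allow taking an address, the same assignment to a volatile local would have to be written inline rather than through a helper.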
<h2>Acknowledgments</h2>
We'd like to thank Hans Boehm at HP, Arch Robison at Intel, and the compiler groups at both Intel and IAR for their valuable feedback that helped us construct this post.  Any mistakes are, of course, ours.
<a href="http://www.embeddedgurus.com/stack-overflow/">Home</a>]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/stack-overflow/2010/03/reading-a-register-for-its-side-effects-in-c-and-c/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Firmware-Specific Bug #5: Heap Fragmentation</title>
				<link>http://embeddedgurus.com/barr-code/2010/03/firmware-specific-bug-5-heap-fragmentation/</link>
				<comments>http://embeddedgurus.com/barr-code/2010/03/firmware-specific-bug-5-heap-fragmentation/#comments</comments>
				<pubDate>Mon, 15 Mar 2010 16:55:48 +0000</pubDate>
				<dc:creator>Michael Barr</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/barr-code/2010/03/firmware-specific-bug-5-heap-fragmentation/</guid>
                <description><![CDATA[Dynamic memory allocation is not widely used by embedded software developers—and for good reasons.  One of those is the problem of fragmentation of the heap.

All data structures created via C’s malloc() standard library routine or C++’s <code>new</code> keyword live on the heap.  The heap is a specific area in RAM of a pre-determined maximum size.  Initially, each allocation from the heap reduces the amount of remaining “free” space by the same number of bytes.  For example, the heap in a particular system might span 10 KB starting from address 0x20200000.  An allocation of a pair of 4-KB data structures would leave 2 KB of free space.

The storage for data structures that are no longer needed can be returned to the heap by a call to free() or use of the <code>delete</code> keyword.  In theory this makes that storage space available for reuse during subsequent allocations.  But the order of allocations and deletions is generally at least pseudo-random—leading the heap to become a mess of smaller fragments.

To see how fragmentation can be a problem, consider what would happen if the first of the above 4-KB data structures were freed.  Now the heap consists of one 4-KB free chunk and another 2-KB free chunk; they are not adjacent and cannot be combined.  So our heap is already fragmented.  Despite 6 KB of total free space, allocations of more than 4 KB will fail.

Fragmentation is similar to entropy: both increase over time.  In a long running system (i.e., most every embedded system ever created), fragmentation may eventually cause some allocation requests to fail.  And what then?  How should your firmware handle the case of a failed heap allocation request?

<em>Best Practice</em>: Avoiding all use of the heap is a sure way of preventing this bug.  But if dynamic memory allocation is either necessary or convenient in your system, there is an alternative way of structuring the heap that will prevent fragmentation.  The key observation is that the problem is caused by variable sized requests.

If all of the requests were of the same size, then any free block would be as good as any other, even if it happens not to be adjacent to any of the other free blocks.  Thus it is possible to use multiple “heaps”, each for allocation requests of a specific size, by using a “memory pool” data structure.

If you like you can write your own fixed-sized memory pool API.  You’ll just need three functions:

<ul>
	<li>handle = pool_create(block_size, num_blocks) - to create a new pool (of num_blocks chunks, each block_size bytes);</li>
	<li>p_block = pool_alloc(handle) - to allocate one chunk (from a specified pool); and</li>
	<li>pool_free(handle, p_block) - to return one chunk (to the pool it came from).</li>
</ul>

But note that many real-time operating systems (RTOSes) feature a fixed-size memory pool API.  If you have access to one of those, use it instead of the compiler's malloc() and free() or your own implementation.

<a href="/barr-code/2010/03/firmware-specific-bug-4-stack-overflow/">Firmware-Specific Bug #4</a>

Firmware-Specific Bug #6 (coming soon)]]></description>
                <content:encoded><![CDATA[Dynamic memory allocation is not widely used by embedded software developers—and for good reasons.  One of those is the problem of fragmentation of the heap.

All data structures created via C’s malloc() standard library routine or C++’s <code>new</code> keyword live on the heap.  The heap is a specific area in RAM of a pre-determined maximum size.  Initially, each allocation from the heap reduces the amount of remaining “free” space by the same number of bytes.  For example, the heap in a particular system might span 10 KB starting from address 0x20200000.  An allocation of a pair of 4-KB data structures would leave 2 KB of free space.

The storage for data structures that are no longer needed can be returned to the heap by a call to free() or use of the <code>delete</code> keyword.  In theory this makes that storage space available for reuse during subsequent allocations.  But the order of allocations and deletions is generally at least pseudo-random—leading the heap to become a mess of smaller fragments.

To see how fragmentation can be a problem, consider what would happen if the first of the above 4-KB data structures were freed.  Now the heap consists of one 4-KB free chunk and another 2-KB free chunk; they are not adjacent and cannot be combined.  So our heap is already fragmented.  Despite 6 KB of total free space, allocations of more than 4 KB will fail.
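The arithmetic in this example can be checked with a toy model. The sketch below is not a real allocator: it simply tracks the 10 KB heap as ten 1-KB units and compares the total free space with the largest contiguous free run. The names toy_heap, mark() and fragmentation_demo() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_KB 10u

/* One flag per KB of a toy 10 KB heap: 1 = allocated, 0 = free. */
typedef struct {
    unsigned char used[HEAP_KB];
} toy_heap;

/* Mark len_kb units starting at start_kb as allocated (1) or free (0). */
static void mark(toy_heap *h, size_t start_kb, size_t len_kb, unsigned char used)
{
    assert(h != NULL);
    for (size_t i = start_kb; i < start_kb + len_kb && i < HEAP_KB; i++) {
        h->used[i] = used;
    }
}

/* Total number of free 1-KB units, adjacent or not. */
size_t total_free_kb(const toy_heap *h)
{
    size_t total = 0;
    for (size_t i = 0; i < HEAP_KB; i++) {
        total += (h->used[i] == 0u);
    }
    return total;
}

/* Longest run of adjacent free units: the biggest request that can succeed. */
size_t largest_free_run_kb(const toy_heap *h)
{
    size_t best = 0, run = 0;
    for (size_t i = 0; i < HEAP_KB; i++) {
        run = (h->used[i] == 0u) ? run + 1 : 0;
        if (run > best) {
            best = run;
        }
    }
    return best;
}

/* Replays the scenario from the text: two 4-KB allocations, then the
   first one is freed. */
void fragmentation_demo(size_t *total, size_t *largest)
{
    toy_heap h = { {0} };

    mark(&h, 0, 4, 1);   /* first 4-KB data structure  */
    mark(&h, 4, 4, 1);   /* second 4-KB data structure */
    mark(&h, 0, 4, 0);   /* free the first one         */

    *total   = total_free_kb(&h);
    *largest = largest_free_run_kb(&h);
}
```

In this model the total free space comes out at 6 KB while the largest usable run is only 4 KB, which is the gap the paragraph above describes.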

Fragmentation is similar to entropy: both increase over time.  In a long running system (i.e., most every embedded system ever created), fragmentation may eventually cause some allocation requests to fail.  And what then?  How should your firmware handle the case of a failed heap allocation request?

<em>Best Practice</em>: Avoiding all use of the heap is a sure way of preventing this bug.  But if dynamic memory allocation is either necessary or convenient in your system, there is an alternative way of structuring the heap that will prevent fragmentation.  The key observation is that the problem is caused by variable sized requests.

If all of the requests were of the same size, then any free block would be as good as any other, even if it happens not to be adjacent to any of the other free blocks.  Thus it is possible to use multiple “heaps”, each for allocation requests of a specific size, by using a “memory pool” data structure.

If you like you can write your own fixed-sized memory pool API.  You’ll just need three functions:

<ul>
	<li>handle = pool_create(block_size, num_blocks) - to create a new pool (of num_blocks chunks, each block_size bytes);</li>
	<li>p_block = pool_alloc(handle) - to allocate one chunk (from a specified pool); and</li>
	<li>pool_free(handle, p_block) - to return one chunk (to the pool it came from).</li>
</ul>

But note that many real-time operating systems (RTOSes) feature a fixed-size memory pool API.  If you have access to one of those, use it instead of the compiler's malloc() and free() or your own implementation.
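For systems without an RTOS-provided pool, the three functions listed above might be sketched roughly as follows. This is one possible shape, not a reference implementation. It assumes all pools are carved once from a static arena at startup and never destroyed, and the names pool_t, align_t, ARENA_UNITS and MAX_POOLS are invented for the example. It is also not safe to call from interrupts or multiple tasks without adding your own locking.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_UNITS 64u   /* assumed arena size, in align_t units */
#define MAX_POOLS   4u    /* assumed upper bound on pools         */

/* Unit of storage: keeps every block suitably aligned and gives a free
   block somewhere to store the link to the next free block. */
typedef union {
    void       *p;
    long long   ll;
    long double ld;
} align_t;

typedef struct {
    align_t *free_list;   /* singly linked list threaded through free blocks */
} pool_t;

static align_t arena[ARENA_UNITS];
static size_t  arena_used;
static pool_t  pools[MAX_POOLS];
static size_t  pools_used;

/* Create a pool of num_blocks blocks, each at least block_size bytes.
   Returns NULL if the arguments are invalid or the arena is exhausted. */
pool_t *pool_create(size_t block_size, size_t num_blocks)
{
    if (block_size == 0 || num_blocks == 0 || pools_used == MAX_POOLS ||
        block_size > ARENA_UNITS * sizeof(align_t)) {
        return NULL;
    }

    /* Round each block up to a whole number of align_t units. */
    size_t stride = (block_size + sizeof(align_t) - 1) / sizeof(align_t);
    if (stride > (ARENA_UNITS - arena_used) / num_blocks) {
        return NULL;
    }

    pool_t *pool = &pools[pools_used++];
    pool->free_list = NULL;
    for (size_t i = 0; i < num_blocks; i++) {
        align_t *block = &arena[arena_used];
        arena_used += stride;
        block->p = pool->free_list;   /* push the block onto the free list */
        pool->free_list = block;
    }
    return pool;
}

/* Take one block from the pool, or NULL if none are left. */
void *pool_alloc(pool_t *pool)
{
    assert(pool != NULL);
    align_t *block = pool->free_list;
    if (block != NULL) {
        pool->free_list = block->p;
    }
    return block;
}

/* Return a block to the pool it was allocated from. */
void pool_free(pool_t *pool, void *p_block)
{
    assert(pool != NULL);
    if (p_block == NULL) {
        return;
    }
    align_t *block = p_block;
    block->p = pool->free_list;
    pool->free_list = block;
}
```

Because every block in a pool has the same size, both allocation and release are constant-time list operations, and a freed block can always satisfy the next request from that pool.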

<a href="/barr-code/2010/03/firmware-specific-bug-4-stack-overflow/">Firmware-Specific Bug #4</a>

Firmware-Specific Bug #6 (coming soon)]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/barr-code/2010/03/firmware-specific-bug-5-heap-fragmentation/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Early Hardware/Firmware Collaboration</title>
				<link>http://embeddedgurus.com/embedded-bridge/2010/03/early-hardwarefirmware-collaboration/</link>
				<comments>http://embeddedgurus.com/embedded-bridge/2010/03/early-hardwarefirmware-collaboration/#comments</comments>
				<pubDate>Sat, 13 Mar 2010 19:23:33 +0000</pubDate>
				<dc:creator>Gary Stringham</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/embedded-bridge/2010/03/early-hardwarefirmware-collaboration/</guid>
                <description><![CDATA[A few years ago, I was invited to a high-level design review of a new SoC. The proposed design contained an I/O block that needed features added. I successfully convinced them to replace that block instead with an existing block that had the needed features, had seen silicon, and had a functional driver. Months later on real hardware, bringing up the driver for that block was painless.

This incident illustrates the benefits of early collaboration between hardware and firmware engineers. All too often, though, the early phases of embedded systems projects consist solely of hardware engineers working on the design of the hardware without firmware engineers involved. This practice makes it difficult to produce designs flexible and robust enough to support features that also rely on firmware. By working together, however, hardware and firmware engineers can avoid these problems.

Early collaboration of hardware and firmware engineers (both formally and informally) gets both groups talking to each other. Each group has a different perspective of their respective needs and of the embedded system. Here are some benefits you can expect:
<ul>
	<li>Reviewing the system requirements together helps reconcile differences and produce a more comprehensive view of the requirements. This helps assure the design meets the requirements.</li>
	<li>Some features can be implemented in either hardware or firmware. Working together, engineers can discuss where the line should be drawn between hardware and firmware, taking into account trade-offs such as performance, flexibility and ease of implementation.</li>
	<li>Collaboration ensures the design of the hardware/firmware interface leads to a smoother integration of hardware and firmware into the embedded product as a whole.</li>
</ul>
Hardware and firmware engineers reap these and other benefits when they collaborate early.

<blockquote><strong>Best Practice:</strong> Start collaboration between hardware and firmware engineers during the initial hardware high-level design phase.</blockquote>

Formal hardware development processes typically include checkpoints with formal reviews and signoffs by key personnel, assuring successful completion of all criteria. Firmware participation in the hardware checkpoints encourages hardware and firmware engineers to review the documentation together and jointly resolve any issues they uncover.

<blockquote><strong>Best Practice:</strong> Include firmware representation in reviews and signoffs of hardware checkpoints throughout the life cycle.</blockquote>

Until our next collaboration...]]></description>
                <content:encoded><![CDATA[A few years ago, I was invited to a high-level design review of a new SoC. The proposed design contained an I/O block that needed features added. I successfully convinced them to replace that block instead with an existing block that had the needed features, had seen silicon, and had a functional driver. Months later on real hardware, bringing up the driver for that block was painless.

This incident illustrates the benefits of early collaboration between hardware and firmware engineers. All too often, though, the early phases of embedded systems projects consist solely of hardware engineers working on the design of the hardware without firmware engineers involved. This practice makes it difficult to produce designs flexible and robust enough to support features that also rely on firmware. By working together, however, hardware and firmware engineers can avoid these problems.

Early collaboration of hardware and firmware engineers (both formally and informally) gets both groups talking to each other. Each group has a different perspective of their respective needs and of the embedded system. Here are some benefits you can expect:
<ul>
	<li>Reviewing the system requirements together helps reconcile differences and produce a more comprehensive view of the requirements. This helps assure the design meets the requirements.</li>
	<li>Some features can be implemented in either hardware or firmware. Working together, engineers can discuss where the line should be drawn between hardware and firmware, taking into account trade-offs such as performance, flexibility and ease of implementation.</li>
	<li>Collaboration ensures the design of the hardware/firmware interface leads to a smoother integration of hardware and firmware into the embedded product as a whole.</li>
</ul>
Hardware and firmware engineers reap these and other benefits when they collaborate early.

<blockquote><strong>Best Practice:</strong> Start collaboration between hardware and firmware engineers during the initial hardware high-level design phase.</blockquote>

Formal hardware development processes typically include checkpoints with formal reviews and signoffs by key personnel, assuring successful completion of all criteria. Firmware participation in the hardware checkpoints encourages hardware and firmware engineers to review the documentation together and jointly resolve any issues they uncover.

<blockquote><strong>Best Practice:</strong> Include firmware representation in reviews and signoffs of hardware checkpoints throughout the life cycle.</blockquote>

Until our next collaboration...]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/embedded-bridge/2010/03/early-hardwarefirmware-collaboration/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Firmware-Specific Bug #4: Stack Overflow</title>
				<link>http://embeddedgurus.com/barr-code/2010/03/firmware-specific-bug-4-stack-overflow/</link>
				<comments>http://embeddedgurus.com/barr-code/2010/03/firmware-specific-bug-4-stack-overflow/#comments</comments>
				<pubDate>Thu, 11 Mar 2010 19:52:51 +0000</pubDate>
				<dc:creator>Michael Barr</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/barr-code/2010/03/firmware-specific-bug-4-stack-overflow/</guid>
                <description><![CDATA[Every programmer knows that a stack overflow is a Very Bad Thing™.  The effect of each stack overflow varies, though.  The nature of the damage and the timing of the misbehavior depend entirely on which data or instructions are clobbered and how they are used.  Importantly, the length of time between a stack overflow and its negative effects on the system depends on how long it is before the clobbered bits are used.

Unfortunately, stack overflow afflicts embedded systems far more often than it does desktop computers.  This is for several reasons, including: 
<ol>
	<li>embedded systems usually have to get by on a smaller amount of RAM;</li>

	<li>there is typically no virtual memory to fall back on (because there is no disk);</li>
 
	<li>firmware designs based on RTOS tasks utilize multiple stacks (one per task), each of which must be sized for its own worst-case stack depth;</li>

	<li>and interrupt handlers may try to use those same stacks.</li>
</ol>

Further complicating this issue, there is no amount of testing that can ensure that a particular stack is sufficiently large.  You can test your system under all sorts of loading conditions but you can only test it for so long.  A stack overflow that only occurs “once in a blue moon” may not be witnessed by tests that run for only “half a blue moon.”  Demonstrating that a stack overflow will never occur can, under algorithmic limitations (such as no recursion), be done with a top down analysis of the control flow of the code.  But a top down analysis will need to be redone every time the code is changed.

<em>Best Practice</em>: On startup, paint an unlikely memory pattern throughout the stack(s).  (I like to use hex <code>23 3D 3D 23</code>, which looks like a fence ‘<code>#==#</code>’ in an ASCII memory dump.)  At runtime, have a supervisor task periodically check that none of the paint above some pre-established high water mark has been changed.  If something is found to be amiss with a stack, log the specific error (e.g., which stack and how high the flood) in non-volatile memory and do something safe for users of the product (e.g., controlled shut down or reset) before a true overflow can occur.  This is a nice additional safety feature to add to the watchdog task.

<a href="/barr-code/2010/02/firmware-specific-bug-3-missing-volatile-keyword/">Firmware-Specific Bug #3</a>

Firmware-Specific Bug #5 (coming soon)]]></description>
                <content:encoded><![CDATA[Every programmer knows that a stack overflow is a Very Bad Thing™.  The effect of each stack overflow varies, though.  The nature of the damage and the timing of the misbehavior depend entirely on which data or instructions are clobbered and how they are used.  Importantly, the length of time between a stack overflow and its negative effects on the system depends on how long it is before the clobbered bits are used.

Unfortunately, stack overflow afflicts embedded systems far more often than it does desktop computers.  This is for several reasons, including: 
<ol>
	<li>embedded systems usually have to get by on a smaller amount of RAM;</li>

	<li>there is typically no virtual memory to fall back on (because there is no disk);</li>
 
	<li>firmware designs based on RTOS tasks utilize multiple stacks (one per task), each of which must be sized for its own worst-case stack depth;</li>

	<li>and interrupt handlers may try to use those same stacks.</li>
</ol>

Further complicating this issue, no amount of testing can guarantee that a particular stack is sufficiently large.  You can test your system under all sorts of loading conditions but you can only test it for so long.  A stack overflow that only occurs “once in a blue moon” may not be witnessed by tests that run for only “half a blue moon.”  Under algorithmic limitations (such as no recursion), a top-down analysis of the control flow of the code can demonstrate that a stack overflow will never occur.  But that top-down analysis must be redone every time the code is changed.

<em>Best Practice</em>: On startup, paint an unlikely memory pattern throughout the stack(s).  (I like to use hex <code>23 3D 3D 23</code>, which looks like a fence ‘<code>#==#</code>’ in an ASCII memory dump.)  At runtime, have a supervisor task periodically check that none of the paint above some pre-established high water mark has been changed.  If something is found to be amiss with a stack, log the specific error (e.g., which stack and how high the flood) in non-volatile memory and do something safe for users of the product (e.g., controlled shut down or reset) before a true overflow can occur.  This is a nice additional safety feature to add to the watchdog task.
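
The painting and checking steps described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the stack is assumed to grow downward, and <code>stack_region_t</code>, <code>stack_paint()</code>, <code>stack_unused_bytes()</code>, and <code>stack_check()</code> are hypothetical names, not part of any particular RTOS. On a real target the painting must happen before the owning task starts running (or must cover only the part of the stack below the current stack pointer).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Paint pattern from the post: bytes 23 3D 3D 23, "#==#" in an ASCII dump. */
static const uint8_t PAINT[4] = { 0x23, 0x3D, 0x3D, 0x23 };

/* Hypothetical descriptor for one task stack. The stack is ASSUMED to grow
 * downward, so base[0] is the last byte reached before a true overflow. */
typedef struct {
    uint8_t *base;  /* lowest address of the stack region */
    size_t   size;  /* size of the region in bytes */
} stack_region_t;

/* Fill the whole region with the paint pattern. Call at startup only,
 * before the task that owns this stack begins to use it. */
void stack_paint(const stack_region_t *s)
{
    assert(s != NULL && s->base != NULL);
    for (size_t i = 0; i < s->size; i++) {
        s->base[i] = PAINT[i % 4];
    }
}

/* Count the bytes, starting from the far (low) end, that still hold paint.
 * This is the headroom the stack has never touched. */
size_t stack_unused_bytes(const stack_region_t *s)
{
    size_t i = 0;
    while (i < s->size && s->base[i] == PAINT[i % 4]) {
        i++;
    }
    return i;
}

/* Return 1 if the flood has risen past the high water mark, that is, if
 * fewer than min_free_bytes of paint remain; otherwise return 0. */
int stack_check(const stack_region_t *s, size_t min_free_bytes)
{
    return stack_unused_bytes(s) < min_free_bytes;
}
```

A supervisor task would call <code>stack_check()</code> for each stack on every pass and, on a nonzero result, log which stack tripped and the value of <code>stack_unused_bytes()</code> before taking the safe action. The check can be fooled if a task happens to write the paint pattern itself, which is why an unlikely pattern matters.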

<a href="/barr-code/2010/02/firmware-specific-bug-3-missing-volatile-keyword/">Firmware-Specific Bug #3</a>

Firmware-Specific Bug #5 (coming soon)]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/barr-code/2010/03/firmware-specific-bug-4-stack-overflow/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Bad Engineering… or Just Good Marketing?</title>
				<link>http://embeddedgurus.com/area-0x51/2010/03/bad-engineering%e2%80%a6-or-just-good-marketing/</link>
				<comments>http://embeddedgurus.com/area-0x51/2010/03/bad-engineering%e2%80%a6-or-just-good-marketing/#comments</comments>
				<pubDate>Mon, 08 Mar 2010 04:01:53 +0000</pubDate>
				<dc:creator>Mike Ficco</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/area-0x51/2010/03/bad-engineering%e2%80%a6-or-just-good-marketing/</guid>
                <description><![CDATA[The other day I went to the gas station and encountered a shiny new pump.  It had a liquid crystal display to interact with me and on top of the pump was a medium sized monitor.  My first thought was "wow – nice", but that first impression didn't last very long.  I was a little disappointed with the responsiveness of the display and authorization of my credit card seemed to take longer than usual.  I began thinking the company had not gotten their money's worth from the engineering department.

My thinking soon degenerated into unkind thoughts as it took FOREVER to fill my car.  It was not just the electronics that were sluggish.  The gasoline was being dispensed at an intolerably slow rate.

Finally, my car was full.  As I removed the nozzle from the car I half-noticed that the advertisements that were droning on the monitor had started over.  I don't go to gas stations to be entertained or be educated and I sure don't go there to watch advertisements.  As I was hanging up the pump the conspiracy theory part of my brain became active.  Could it be that the slowness of the pump was not the result of bad engineering but of the engineering team adhering to marketing requirements to delay the customer?  Was I purposefully held captive at the pump so they could show me more advertisements?

My feelings of pity for the company turned to dismay.  I no longer thought the company spent good money to get an engineering piece of junk.  Rather, the engineers did a fine job on this pump and built it just like the marketing department wanted it.

Am I being more paranoid and delusional than normal?  I don't think so.  We are fortunate to live in a time when the raw power of economical hardware and the sophistication of software can produce miracles.  Unfortunately, too few products and services get to flex their technological muscles without being crippled in some fashion.  We are certainly on the verge of a "Brave New World", but I can't tell if this new world will be wonderful or horrible.  Will we have unbelievable health and convenience?  Will we have knowledge and facts at our fingertips, or will facts cease to exist as our every interaction with the world is flavored and colored by the whims of those who control the information outlets and connections?

I fear it is possible that some of the scenes of Steven Spielberg's <em>Minority Report</em> will be proven uncomfortably prescient and we will be fed a non-stop deluge of individualized advertisements, polls, and surveys.  Our time will be subtly consumed as customer service lines force us to wait as they are ALWAYS experiencing "unusually high call volume".  While we are waiting we will be told of their latest sales and how much they value our business.

The machines that do all of this, like the <em>Skynet Terminators</em>, will feel no remorse and no sympathy.  They will simply do what we, the engineers, designed them to do.

… OH!  The Shame!!]]></description>
                <content:encoded><![CDATA[The other day I went to the gas station and encountered a shiny new pump.  It had a liquid crystal display to interact with me and on top of the pump was a medium sized monitor.  My first thought was "wow – nice", but that first impression didn't last very long.  I was a little disappointed with the responsiveness of the display and authorization of my credit card seemed to take longer than usual.  I began thinking the company had not gotten their money's worth from the engineering department.

My thinking soon degenerated into unkind thoughts as it took FOREVER to fill my car.  It was not just the electronics that were sluggish.  The gasoline was being dispensed at an intolerably slow rate.

Finally, my car was full.  As I removed the nozzle from the car I half-noticed that the advertisements that were droning on the monitor had started over.  I don't go to gas stations to be entertained or be educated and I sure don't go there to watch advertisements.  As I was hanging up the pump the conspiracy theory part of my brain became active.  Could it be that the slowness of the pump was not the result of bad engineering but of the engineering team adhering to marketing requirements to delay the customer?  Was I purposefully held captive at the pump so they could show me more advertisements?

My feelings of pity for the company turned to dismay.  I no longer thought the company spent good money to get an engineering piece of junk.  Rather, the engineers did a fine job on this pump and built it just like the marketing department wanted it.

Am I being more paranoid and delusional than normal?  I don't think so.  We are fortunate to live in a time when the raw power of economical hardware and the sophistication of software can produce miracles.  Unfortunately, too few products and services get to flex their technological muscles without being crippled in some fashion.  We are certainly on the verge of a "Brave New World", but I can't tell if this new world will be wonderful or horrible.  Will we have unbelievable health and convenience?  Will we have knowledge and facts at our fingertips, or will facts cease to exist as our every interaction with the world is flavored and colored by the whims of those who control the information outlets and connections?

I fear it is possible that some of the scenes of Steven Spielberg's <em>Minority Report</em> will be proven uncomfortably prescient and we will be fed a non-stop deluge of individualized advertisements, polls, and surveys.  Our time will be subtly consumed as customer service lines force us to wait as they are ALWAYS experiencing "unusually high call volume".  While we are waiting we will be told of their latest sales and how much they value our business.

The machines that do all of this, like the <em>Skynet Terminators</em>, will feel no remorse and no sympathy.  They will simply do what we, the engineers, designed them to do.

… OH!  The Shame!!]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/area-0x51/2010/03/bad-engineering%e2%80%a6-or-just-good-marketing/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Welcome to the new stack-overflow!</title>
				<link>http://embeddedgurus.com/stack-overflow/2010/03/welcome-to-the-new-stack-overflow/</link>
				<comments>http://embeddedgurus.com/stack-overflow/2010/03/welcome-to-the-new-stack-overflow/#comments</comments>
				<pubDate>Wed, 03 Mar 2010 13:19:13 +0000</pubDate>
				<dc:creator>Nigel Jones</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/stack-overflow/2010/03/welcome-to-the-new-stack-overflow/</guid>
                <description><![CDATA[Regular visitors will no doubt have noticed a rather dramatic change to the visual appearance of this blog. EmbeddedGurus has grown dramatically in the last year and so we've moved to a better platform (WordPress) to manage the growth. Although the switchover from Blogger has been relatively painless, I still need to manually check all my previous posts to make sure they are right. I should have this done in the next few days, at which point I will resume regular blogging.

If you've posted a comment to my blog in the last week or so it may not have made the transition - for which I apologize.
<a href="http://www.embeddedgurus.com/stack-overflow/">Home</a>]]></description>
                <content:encoded><![CDATA[Regular visitors will no doubt have noticed a rather dramatic change to the visual appearance of this blog. EmbeddedGurus has grown dramatically in the last year and so we've moved to a better platform (WordPress) to manage the growth. Although the switchover from Blogger has been relatively painless, I still need to manually check all my previous posts to make sure they are right. I should have this done in the next few days, at which point I will resume regular blogging.

If you've posted a comment to my blog in the last week or so it may not have made the transition - for which I apologize.
<a href="http://www.embeddedgurus.com/stack-overflow/">Home</a>]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/stack-overflow/2010/03/welcome-to-the-new-stack-overflow/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Embedded Gurus - Site Redesign</title>
				<link>http://embeddedgurus.com/barr-code/2010/03/embedded-gurus-site-redesign/</link>
				<comments>http://embeddedgurus.com/barr-code/2010/03/embedded-gurus-site-redesign/#comments</comments>
				<pubDate>Tue, 02 Mar 2010 21:20:27 +0000</pubDate>
				<dc:creator>Michael Barr</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/barr-code/2010/03/embedded-gurus-site-redesign/</guid>
                <description><![CDATA[I am pleased to announce that the <a href="http://www.embeddedgurus.com">EmbeddedGurus</a> website has been redesigned.  Among the new features of the site are:

1.  A dynamically updating home page, featuring the most recent posts from all of our bloggers.  If you prefer, you may view these posts <a href="/categories">by category</a>.

2.  A common look and feel to all of the individual blogs.

3.  The ability to search individual blogs, as well as to easily browse from one post to the next and via tags and categories.

4.  A sixth guru named <a href="/gurus/gary-stringham">Gary Stringham</a> with a blog called <a href="/embedded-bridge/">Embedded Bridge</a>.

A number of other minor improvements have also been made.

We hope you like the new look and continue to find our blogs about embedded systems design both readable and informative. ]]></description>
                <content:encoded><![CDATA[I am pleased to announce that the <a href="http://www.embeddedgurus.com">EmbeddedGurus</a> website has been redesigned.  Among the new features of the site are:

1.  A dynamically updating home page, featuring the most recent posts from all of our bloggers.  If you prefer, you may view these posts <a href="/categories">by category</a>.

2.  A common look and feel to all of the individual blogs.

3.  The ability to search individual blogs, as well as to easily browse from one post to the next and via tags and categories.

4.  A sixth guru named <a href="/gurus/gary-stringham">Gary Stringham</a> with a blog called <a href="/embedded-bridge/">Embedded Bridge</a>.

A number of other minor improvements have also been made.

We hope you like the new look and continue to find our blogs about embedded systems design both readable and informative. ]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/barr-code/2010/03/embedded-gurus-site-redesign/feed/</wfw:commentRss>
			</item>
						<item>
				<title>Welcome to the Embedded Bridge Blog</title>
				<link>http://embeddedgurus.com/embedded-bridge/2010/03/welcome-to-the-embedded-bridge-blog/</link>
				<comments>http://embeddedgurus.com/embedded-bridge/2010/03/welcome-to-the-embedded-bridge-blog/#comments</comments>
				<pubDate>Tue, 02 Mar 2010 14:47:32 +0000</pubDate>
				<dc:creator>Gary Stringham</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/embedded-bridge/2010/03/welcome-to-the-embedded-bridge-blog/</guid>
                <description><![CDATA[Welcome to the new Embedded Bridge blog on the newly redesigned Embedded Gurus website. So why the word "bridge"? It is not used in the literal sense of a bridge that carries human or vehicular traffic. (Although I came across this embedded bridge in <a title="Inc Magazine" href="http://www.inc.com/rollovers/bridge/index.html" target="_blank">Inc Magazine</a>. Roll the mouse over numbers 1 and 4.)

I used "bridge" in the conceptual sense to enable two sides to work together, bridging the gap between the two, and establishing processes, procedures, and protocols between the two. I have written much (<a title="book" href="http://garystringham.com/hwfwbook" target="_blank">book</a> and <a title="newsletters" href="http://garystringham.com/newsletter" target="_blank">newsletters</a>) about hardware and firmware (a.k.a. embedded software), both in terms of their respective engineers and in terms of the design of the interface between hardware and firmware. But I also like to discuss bridging the gap between any two other entities, such as management and engineers or the end user and their embedded systems device.

In addition I'll post other musings in the embedded and engineering space. Until next time ...]]></description>
                <content:encoded><![CDATA[Welcome to the new Embedded Bridge blog on the newly redesigned Embedded Gurus website. So why the word "bridge"? It is not used in the literal sense of a bridge that carries human or vehicular traffic. (Although I came across this embedded bridge in <a title="Inc Magazine" href="http://www.inc.com/rollovers/bridge/index.html" target="_blank">Inc Magazine</a>. Roll the mouse over numbers 1 and 4.)

I used "bridge" in the conceptual sense to enable two sides to work together, bridging the gap between the two, and establishing processes, procedures, and protocols between the two. I have written much (<a title="book" href="http://garystringham.com/hwfwbook" target="_blank">book</a> and <a title="newsletters" href="http://garystringham.com/newsletter" target="_blank">newsletters</a>) about hardware and firmware (a.k.a. embedded software), both in terms of their respective engineers and in terms of the design of the interface between hardware and firmware. But I also like to discuss bridging the gap between any two other entities, such as management and engineers or the end user and their embedded systems device.

In addition I'll post other musings in the embedded and engineering space. Until next time ...]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/embedded-bridge/2010/03/welcome-to-the-embedded-bridge-blog/feed/</wfw:commentRss>
			</item>
						<item>
				<title>The Challenge of Debugging Cache Coherency Problems</title>
				<link>http://embeddedgurus.com/barr-code/2010/02/the-challenge-of-debugging-cache-coherency-problems/</link>
				<comments>http://embeddedgurus.com/barr-code/2010/02/the-challenge-of-debugging-cache-coherency-problems/#comments</comments>
				<pubDate>Fri, 19 Feb 2010 16:18:00 +0000</pubDate>
				<dc:creator>Michael Barr</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/barr-code/2010/02/the-challenge-of-debugging-cache-coherency-problems/</guid>
                <description><![CDATA[<p>The following is an example of a cache-related embedded software bug that is a real challenge to solve for several reasons, not the least of which is the fact that the actual problem was masked in the debugger's view of memory.</p>
<blockquote><p>One nasty bug that came up recently for us was the realization that we were not flushing the instruction cache after leaving the bootloader which had a very confusing effect when running our application. In our design our code pretty much runs out of flash. Our bootloader is in the lowest part of flash and our 2 images sit in their own higher memory ranges of flash. So we never realized we should do this.</p>
<p>Well, we had to copy a small piece of code into RAM for the purpose of allowing firmware upgrades to be written to flash. This piece of code would be executing when the actual erases and writes took place (i.e. we couldn't execute from AND write to flash at the same time). This code would get copied out of flash both when the bootloader started execution AND when the image would start execution because they shared the startup code that we inherited from a board development kit (BDK).</p>
<p>Another thing we didn't realize was that the RAM code optimized differently for the bootloader image and the application images. The end result is that the instruction cache would in certain cases have a hit and return the wrong instructions for us. For instance, when we tried to perform an upgrade while running from our image, it would erase a completely different area of flash than we intended. To make things somewhat more confusing, it did NOT help to step through the code using the debugger. The debugger was not showing us that the instruction cache was providing different lines of code than the lines of source it was showing.</p>
<p>This was ultimately one of the more frustrating bugs we have chased recently. Imagine the confusion when sometimes a firmware upgrade would work fine and other times it would completely brick your board (they could be salvaged with a JTAG programmer at least).</p></blockquote>
<p>Thanks to Richard von Lehe of <a href="http://www.starkey.com">Starkey Labs</a> for sharing this.</p>]]></description>
                <content:encoded><![CDATA[<p>The following is an example of a cache-related embedded software bug that is a real challenge to solve for several reasons, not the least of which is the fact that the actual problem was masked in the debugger's view of memory.</p>
<blockquote><p>One nasty bug that came up recently for us was the realization that we were not flushing the instruction cache after leaving the bootloader which had a very confusing effect when running our application. In our design our code pretty much runs out of flash. Our bootloader is in the lowest part of flash and our 2 images sit in their own higher memory ranges of flash. So we never realized we should do this.</p>
<p>Well, we had to copy a small piece of code into RAM for the purpose of allowing firmware upgrades to be written to flash. This piece of code would be executing when the actual erases and writes took place (i.e. we couldn't execute from AND write to flash at the same time). This code would get copied out of flash both when the bootloader started execution AND when the image would start execution because they shared the startup code that we inherited from a board development kit (BDK).</p>
<p>Another thing we didn't realize was that the RAM code optimized differently for the bootloader image and the application images. The end result is that the instruction cache would in certain cases have a hit and return the wrong instructions for us. For instance, when we tried to perform an upgrade while running from our image, it would erase a completely different area of flash than we intended. To make things somewhat more confusing, it did NOT help to step through the code using the debugger. The debugger was not showing us that the instruction cache was providing different lines of code than the lines of source it was showing.</p>
<p>This was ultimately one of the more frustrating bugs we have chased recently. Imagine the confusion when sometimes a firmware upgrade would work fine and other times it would completely brick your board (they could be salvaged with a JTAG programmer at least).</p></blockquote>
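
The remedy implied by the account above, invalidating the instruction cache for a RAM region after new code has been copied into it, might look something like the C sketch below. It assumes a GCC or Clang toolchain, where <code>__builtin___clear_cache</code> is available; whether that builtin actually performs the needed cache maintenance on a given bare-metal target depends on the toolchain and processor, and a vendor-specific cache routine may be required instead. The function name <code>load_ram_code()</code> and its parameters are hypothetical and are not taken from the project described.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy a routine's machine-code image from flash into RAM, then ask the
 * toolchain to flush the instruction cache for the destination range so
 * that stale instructions cached for those addresses are not executed.
 *
 * ASSUMPTION: GCC/Clang builtin. On some targets it expands to nothing,
 * in which case a processor-specific invalidate routine is needed here. */
void load_ram_code(void *ram_dst, const void *flash_src, size_t len)
{
    assert(ram_dst != NULL && flash_src != NULL);

    memcpy(ram_dst, flash_src, len);

    /* Range is [begin, end): begin inclusive, end exclusive. */
    __builtin___clear_cache((char *)ram_dst, (char *)ram_dst + len);
}
```

The point of the design is that the copy and the cache maintenance live in one function, so that neither the bootloader nor the application image can perform one without the other.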
<p>Thanks to Richard von Lehe of <a href="http://www.starkey.com">Starkey Labs</a> for sharing this.</p>]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/barr-code/2010/02/the-challenge-of-debugging-cache-coherency-problems/feed/</wfw:commentRss>
			</item>
						<item>
				<title>So you want to be an independent contractor?</title>
				<link>http://embeddedgurus.com/stack-overflow/2010/02/so-you-want-to-be-an-independent-contractor/</link>
				<comments>http://embeddedgurus.com/stack-overflow/2010/02/so-you-want-to-be-an-independent-contractor/#comments</comments>
				<pubDate>Fri, 19 Feb 2010 14:19:00 +0000</pubDate>
				<dc:creator>Nigel Jones</dc:creator>
				<guid isPermaLink="false">http://embeddedgurus.com/stack-overflow/2010/02/so-you-want-to-be-an-independent-contractor/</guid>
                <description><![CDATA[Today's post is motivated by the <a href="http://www.cnn.com/2010/US/02/18/texas.plane.crash/index.html?hpt=T1">events</a> that happened yesterday in Austin, Texas. For my overseas visitors, a software engineer, Joe Stack, decided to fly his light aircraft into an office building that housed the regional offices of the IRS (the American tax office). He caused tremendous damage and likely murdered at least one person, while killing himself. Notwithstanding that I <a href="http://embeddedgurus.com/stack-overflow/2009/12/terrorist-engineers/">wrote</a> just a few weeks ago about the propensity for engineers to be involved in terrorist acts, what is relevant about this news item is that it appears that Joe Stack's principal complaint concerned a portion of the <a href="http://taxfoundation.org/blog/show/25870.html">US tax code</a> (via <a href="http://www.salon.com/news/joe_stack/index.html?story=/tech/htww/2010/02/18/joe_stack_s_tax_problem">Andrew Leonard</a>) that applies almost uniquely to consultants / independent contractors in the software / firmware field.

So while this isn't a tax advice blog, I thought I'd weigh in on the issue, since it's something that applies to me, and indeed anyone thinking of becoming a consultant / independent contractor in the USA.

The main issue revolves around who is an employee and who is an independent contractor. From a tax perspective this is an important distinction, because companies can avoid a lot of overhead by classifying employees as independent contractors. For example, employers avoid paying the employer contribution to social security, which for a typical engineer in the USA was around US$7000 per person in 2009. Instead the independent contractor is responsible for this payment. Conversely, independent contractors get some benefits that employees do not. For example an independent contractor can normally deduct from his taxable income the cost of travel to and from a client's office.

Now whether one prefers employee or independent status is of course a matter of income levels and personal preference. However, what is crucial is that one not fall somewhere in the middle - because if you do you run the risk of being re-classified by the IRS - at which point the tax bills can start getting very large for everyone involved. This falling in the middle tends to occur when someone is classified as an independent by the company - but acts like an employee. That is, they work the same hours, and do the same work at the same time in the same location as someone who is an employee. If this describes you, or it describes a job that you are considering, then I suggest you read on.

Note that in the following, I have assumed that you want to be an independent contractor. If you are classified this way and instead want to be an employee, then do the opposite of what is advised!

<strong>Time</strong>
An independent contractor must be free to set their own work hours. Although it is OK for an organization to say you can't work, e.g. after 9 pm or before 6 am, it is not OK for them to specify your exact work hours. Furthermore, it is important that you exercise this right. For example if you are required to be on site 40 hours a week, then to preserve your independent contractor status it would be smart to work e.g. four 10 hour days, rather than the normal five 8 hour days. I also recommend that you strive to get the right to work from your home office for a certain percentage of the time. This helps establish your home office as a bona fide work place while simultaneously bolstering your independent status.

<strong>Tools</strong>
An independent contractor is normally expected to provide their own tools. Now clearly you are unlikely to own a $25,000 spectrum analyzer. However as an independent contractor it is certainly reasonable that you provide your own computer and other tools such as compilers, email clients etc. The problem with this is that it often clashes with a company's IT policy. When this happens I strongly suggest that you sit down with the various parties (HR, IT, your recruiting manager etc) to address the issue. There are various options available, but the bottom line is you need to protect your status - and the company (if it's on the ball) will want to do the same.

<strong>Multiple Clients</strong>
The final way in which I handle this issue is by having multiple concurrent clients. This not only helps you meet the time requirement, but it also strongly reinforces the fact that you are free to work for whom you want, when you want - almost the definition of an independent contractor.

Well that's my practical guide to not falling afoul of the IRS rules. Hopefully for those of you contemplating going into the consulting business you'll have found it useful.

I'll be returning to my more normal fare with my next post. As a heads up, embedded-gurus is undergoing a major face-lift over the next few weeks, which may impact not only my posting schedule, but also all the other bloggers here.
<a href="http://www.embeddedgurus.com/stack-overflow/">Home</a>]]></description>
                <content:encoded><![CDATA[Today's post is motivated by the <a href="http://www.cnn.com/2010/US/02/18/texas.plane.crash/index.html?hpt=T1">events</a> that happened yesterday in Austin, Texas. For my overseas visitors, a software engineer, Joe Stack, decided to fly his light aircraft into an office building that housed the regional offices of the IRS (the American tax office). He caused tremendous damage and likely murdered at least one person, while killing himself. Notwithstanding that I <a href="http://embeddedgurus.com/stack-overflow/2009/12/terrorist-engineers/">wrote</a> just a few weeks ago about the propensity for engineers to be involved in terrorist acts, what is relevant about this news item is that it appears that Joe Stack's principal complaint concerned a portion of the <a href="http://taxfoundation.org/blog/show/25870.html">US tax code</a> (via <a href="http://www.salon.com/news/joe_stack/index.html?story=/tech/htww/2010/02/18/joe_stack_s_tax_problem">Andrew Leonard</a>) that applies almost uniquely to consultants / independent contractors in the software / firmware field.

So while this isn't a tax advice blog, I thought I'd weigh in on the issue, since it's something that applies to me, and indeed anyone thinking of becoming a consultant / independent contractor in the USA.

The main issue revolves around who is an employee and who is an independent contractor. From a tax perspective this is an important distinction, because companies can avoid a lot of overhead by classifying employees as independent contractors. For example, employers avoid paying the employer contribution to social security, which for a typical engineer in the USA was around US$7000 per person in 2009. Instead the independent contractor is responsible for this payment. Conversely, independent contractors get some benefits that employees do not. For example an independent contractor can normally deduct from his taxable income the cost of travel to and from a client's office.

Now whether one prefers employee or independent status is of course a matter of income levels and personal preference. However, what is crucial is that one not fall somewhere in the middle - because if you do you run the risk of being re-classified by the IRS - at which point the tax bills can start getting very large for everyone involved. This falling in the middle tends to occur when someone is classified as an independent by the company - but acts like an employee. That is, they work the same hours, and do the same work at the same time in the same location as someone who is an employee. If this describes you, or it describes a job that you are considering, then I suggest you read on.

Note that in the following, I have assumed that you want to be an independent contractor. If you are classified this way and instead want to be an employee, then do the opposite of what is advised!

<strong>Time</strong>
An independent contractor must be free to set their own work hours. Although it is OK for an organization to say you can't work, e.g. after 9 pm or before 6 am, it is not OK for them to specify your exact work hours. Furthermore, it is important that you exercise this right. For example if you are required to be on site 40 hours a week, then to preserve your independent contractor status it would be smart to work e.g. four 10 hour days, rather than the normal five 8 hour days. I also recommend that you strive to get the right to work from your home office for a certain percentage of the time. This helps establish your home office as a bona fide work place while simultaneously bolstering your independent status.

<strong>Tools</strong>
An independent contractor is normally expected to provide their own tools. Now clearly you are unlikely to own a $25,000 spectrum analyzer. However as an independent contractor it is certainly reasonable that you provide your own computer and other tools such as compilers, email clients etc. The problem with this is that it often clashes with a company's IT policy. When this happens I strongly suggest that you sit down with the various parties (HR, IT, your recruiting manager etc) to address the issue. There are various options available, but the bottom line is you need to protect your status - and the company (if it's on the ball) will want to do the same.

<strong>Multiple Clients</strong>
The final way in which I handle this issue is by having multiple concurrent clients. This not only helps you meet the time requirement, but it also strongly reinforces the fact that you are free to work for whom you want, when you want - almost the definition of an independent contractor.

Well that's my practical guide to not falling afoul of the IRS rules. Hopefully for those of you contemplating going into the consulting business you'll have found it useful.

I'll be returning to my more normal fare with my next post. As a heads up, embedded-gurus is undergoing a major face-lift over the next few weeks, which may impact not only my posting schedule, but also all the other bloggers here.
<a href="http://www.embeddedgurus.com/stack-overflow/">Home</a>]]></content:encoded>
				<wfw:commentRss>http://embeddedgurus.com/stack-overflow/2010/02/so-you-want-to-be-an-independent-contractor/feed/</wfw:commentRss>
			</item>
			</channel>
</rss>
