<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Barr Code &#187; architecture</title>
	<atom:link href="http://embeddedgurus.com/barr-code/tag/architecture/feed/" rel="self" type="application/rss+xml" />
	<link>http://embeddedgurus.com/barr-code</link>
	<description>A Blog by Michael Barr</description>
	<lastBuildDate>Wed, 18 Apr 2012 19:33:32 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Building Reliable and Secure Embedded Systems</title>
		<link>http://embeddedgurus.com/barr-code/2012/03/building-reliable-and-secure-embedded-systems/</link>
		<comments>http://embeddedgurus.com/barr-code/2012/03/building-reliable-and-secure-embedded-systems/#comments</comments>
		<pubDate>Tue, 13 Mar 2012 10:49:45 +0000</pubDate>
		<dc:creator>Michael Barr</dc:creator>
				<category><![CDATA[Firmware Bugs]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[bugs]]></category>
		<category><![CDATA[embedded]]></category>
		<category><![CDATA[ethics]]></category>
		<category><![CDATA[firmware]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[realtime]]></category>
		<category><![CDATA[safety]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[standards]]></category>
		<category><![CDATA[trends]]></category>

		<guid isPermaLink="false">http://embeddedgurus.com/barr-code/?p=710</guid>
		<description><![CDATA[In this era of 140 characters or less, it has been well and concisely stated that, &#8220;RELIABILITY concerns ACCIDENTAL errors causing failures, whereas SECURITY concerns INTENTIONAL errors causing failures.&#8221; In this column I expand on this statement, especially as regards the design of embedded systems and their place in our network-connected and safety-conscious modern world. [...]]]></description>
			<content:encoded><![CDATA[<p><em>In this era of 140 characters or less, it has been well and concisely stated that, &#8220;RELIABILITY concerns ACCIDENTAL errors causing failures, whereas SECURITY concerns INTENTIONAL errors causing failures.&#8221;  In this column I expand on this statement, especially as regards the design of embedded systems and their place in our network-connected and safety-conscious modern world.</em></p>
<p>As the designers of embedded systems, the first thing we must accomplish on any project is to make the hardware and software work.  That is to say we need to make the system behave as it was designed to.  The first iteration of this is often flaky; certain uses or perturbations of the system by testers can easily dislodge the system into a non-working state.  In common parlance, &#8220;expect bugs.&#8221;  </p>
<p>Given time, tightening cycles of debug and test can get us past the bugs and through to a shippable product.  But is a debugged system good enough?  Neither reliability nor security can be tested into a product.  Each must be designed in from the start.  So let&#8217;s take a closer look at these two important design aspects for modern embedded systems and then I&#8217;ll bring them back together at the end.</p>
<p><strong>Reliable Embedded Systems</strong></p>
<p>A product can be stable yet lack reliability.  Consider, for example, an anti-lock braking computer installed in a car.  The software in the anti-lock brakes may be bug-free, but how does it function if a critical input sensor fails?</p>
<p>Reliable systems are robust in the face of adverse run-time environments.  Reliable systems are able to work around errors as they occur in the field&#8211;so that the number and impact of failures are minimized.  One key strategy for building reliable systems is to eliminate single points of failure.  For example, redundancy could be added around that critical input sensor&#8211;perhaps by adding a second sensor in parallel with the first.</p>
<p>Another aspect of reliability that is under the complete control of designers (at least when they consider it from the start) is the set of &#8220;fail-safe&#8221; mechanisms.  Perhaps a suitable but lower-cost alternative to a redundant sensor is detection of the failed sensor with a fallback to mechanical braking.</p>
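<p>As a minimal sketch of how redundancy and a fail-safe can combine, the C fragment below picks a braking mode from two redundant speed readings. Everything in it is an assumption made for illustration: the names (select_brake_mode, brake_mode_t), the range limit, and the disagreement threshold are hypothetical and do not come from any real braking system.</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits; real values would come from the sensor datasheet. */
#define SPEED_MAX_RPM         3000u
#define MAX_DISAGREEMENT_RPM    50u

typedef enum { BRAKE_MODE_ABS, BRAKE_MODE_MECHANICAL } brake_mode_t;

/* A reading is plausible only if it was flagged valid and is within range. */
static bool reading_plausible(bool valid, uint16_t rpm)
{
    return valid && (rpm <= SPEED_MAX_RPM);
}

/*
 * Choose the braking mode from two redundant sensor readings.
 * When BRAKE_MODE_ABS is returned, *speed_out holds the speed to use.
 * When BRAKE_MODE_MECHANICAL is returned, *speed_out is left untouched.
 */
brake_mode_t select_brake_mode(bool a_valid, uint16_t a_rpm,
                               bool b_valid, uint16_t b_rpm,
                               uint16_t *speed_out)
{
    bool a_ok = reading_plausible(a_valid, a_rpm);
    bool b_ok = reading_plausible(b_valid, b_rpm);

    if (a_ok && b_ok) {
        uint16_t diff = (a_rpm > b_rpm) ? (uint16_t)(a_rpm - b_rpm)
                                        : (uint16_t)(b_rpm - a_rpm);
        if (diff > MAX_DISAGREEMENT_RPM) {
            /* Sensors disagree and we cannot tell which is wrong: fail safe. */
            return BRAKE_MODE_MECHANICAL;
        }
        *speed_out = (uint16_t)((a_rpm + b_rpm) / 2u);
        return BRAKE_MODE_ABS;
    }
    if (a_ok) { *speed_out = a_rpm; return BRAKE_MODE_ABS; }   /* sensor B failed */
    if (b_ok) { *speed_out = b_rpm; return BRAKE_MODE_ABS; }   /* sensor A failed */
    return BRAKE_MODE_MECHANICAL;                              /* both failed */
}
```

<p>Note the design choice when the two sensors disagree: with only two sources there is no way to vote, so the sketch falls back rather than guessing. A third sensor would allow majority voting at a higher hardware cost.</p>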
<p><a href="http://en.wikipedia.org/wiki/Failure_mode_and_effects_analysis" title="Failure Modes and Effects Analysis" target="_blank">Failure Mode and Effects Analysis</a> (FMEA) is one of the most effective and important design processes used by engineers serious about designing reliability into their systems.  Following this process, each possible failure point is traced from the root failure outward to its effects.  In an FMEA, numerical weights can be applied to the likelihood of each failure as well as the seriousness of its consequences.  An FMEA can thus help guide you to a cost-effective but higher-reliability design by highlighting the most valuable places to insert the redundancy, fail-safes, or other elements that reinforce the system&#8217;s overall reliability.</p>
<p>In certain industries, reliability is a key driver of product safety.  And that is why you see these techniques, FMEA, and other design-for-reliability processes being applied by the designers of safety-critical automotive, medical, avionics, nuclear, and industrial systems.  The same techniques can, of course, be used to make any type of embedded system more reliable.</p>
<p>Regardless of your industry, it is typically difficult or impossible to make your product as reliable via patches as it would have been by design.  There&#8217;s no way to add hardware like that redundant sensor, so your options may reduce to a fail-safe that is helpful but less reliable overall.  Reliability cannot be patched or tested or debugged into your system.  Rather, reliability must be designed in from the start.</p>
<p><strong>Secure Embedded Systems</strong></p>
<p>A product can also be stable yet lack security.  For example, an office printer is the kind of product most of us purchase and use without giving a minute of thought to security.  The software in the printer may be bug-free, but is it able to prevent a would-be eavesdropper from capturing a remote electronic copy of everything you print, including your sensitive financial documents?</p>
<p>Secure systems are robust in the face of persistent attack.  Secure systems are able to keep hackers out by design.  One key strategy for building secure systems is to validate all inputs, especially those arriving over an open network connection.  For example, security could be added to a printer by ensuring against buffer overflows and encrypting and digitally signing firmware updates.</p>
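<p>To make &#8220;validate all inputs&#8221; concrete, here is a minimal sketch in C of parsing a length-prefixed field from an untrusted packet without risking a buffer overflow. The packet layout, the field size, and the names parse_job_name and print_job_t are hypothetical, invented only for this illustration.</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JOB_NAME_MAX 32u   /* hypothetical maximum field size */

typedef struct {
    char name[JOB_NAME_MAX + 1u];   /* always NUL-terminated */
} print_job_t;

/*
 * Parse a hypothetical packet laid out as: [len: 1 byte][name: len bytes].
 * Returns false, leaving *job untouched, if the packet is malformed.
 */
bool parse_job_name(const uint8_t *pkt, size_t pkt_len, print_job_t *job)
{
    if (pkt == NULL || job == NULL || pkt_len < 1u) {
        return false;
    }

    size_t claimed = pkt[0];

    if (claimed > JOB_NAME_MAX) {
        return false;               /* would overflow the destination buffer */
    }
    if (claimed > pkt_len - 1u) {
        return false;               /* sender claims more bytes than it sent */
    }

    memcpy(job->name, &pkt[1], claimed);
    job->name[claimed] = '\0';
    return true;
}
```

<p>The key point is that the length claimed by the sender is checked against both the size of the destination and the number of bytes actually received before anything is copied.</p>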
<p>One of the unfortunate facts of designing secure embedded systems is that the hackers who want to get in only need to find and exploit a single weakness.  Adding layers of security is good, but if even one of those layers remains fundamentally weak, a sufficiently motivated attacker will eventually find and breach that defense.  But that&#8217;s not an excuse for not trying.</p>
<p>For years, the largest printer maker in the world apparently gave little thought to the security of the firmware in its home/office printers, even as it was putting tens of millions of tempting targets out into the world.  Now <a href="http://events.ccc.de/congress/2011/Fahrplan/track/Hacking/4780.en.html" title="Print Me If You Dare" target="_blank">the security of those printers has been breached by security researchers</a> with a reasonable awareness of embedded systems design.  Said one of the lead researchers, &#8220;We can actually modify the firmware of the printer as part of a legitimate document. It renders correctly, and at the end of the job there&#8217;s a firmware update. &#8230; In a super-secure environment where there&#8217;s a firewall and no access &#8212; the government, Wall Street &#8212; you could send a résumé to print out.&#8221;</p>
<p>Security is a brave new world for many embedded systems designers.  For decades we have relied on the fact that the microcontrollers and Flash memory and real-time operating systems and other less mainstream technologies we use will protect our products from attack.  Or that we can gain enough &#8220;security by obscurity&#8221; by keeping our communications protocols and firmware upgrade processes secret.  But we no longer live in that world.  You must adapt.</p>
<p>Consider the implications of an <a href="http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=5504804" title="Experimental Security Analysis of a Modern Automobile" target="_blank">insecure design of an automotive safety system that is connected to another Internet-connected computer in the car via CAN</a>;  or the insecure design of an <a href="http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4531149&amp;isnumber=4531132" title="Pacemakers and Implantable Cardiac Defibrillators: Software Radio Attacks and Zero-Power Defenses" target="_blank">implanted medical device</a>; or the insecure design of <em>your</em> product.</p>
<p>Too often, the ability to upgrade a product&#8217;s firmware in the field is the very vector that&#8217;s used to attack.  This can happen even when a primary motivation for including remote firmware updates is security.  For example, as I&#8217;ve learned in my work as an <a href="http://netrino.com/images/cv-barr.pdf" title="Expert Witness Resume" target="_blank">expert witness</a> in numerous cases involving reverse engineering of the techniques and technology of satellite television piracy, much of that piracy has been empowered by the same software patching mechanism that allowed the broadcasters to perform security upgrades and electronic countermeasures.  Ironically, had the security smart cards in those set-top boxes contained only masked ROM images, the overall system security might have been higher.  This was certainly not what the designers of the system had in mind.  But security is also an arms race.</p>
<p>Like reliability, security must be designed in from the start.  Security can&#8217;t be patched or tested or debugged in.  You simply can&#8217;t add security as effectively once the product ships.  For example, an attacker who wished to exploit a current weakness in your office printer or smart card might download his hack software into your device and write-protect his sectors of the flash today so that his code could remain resident even as you applied security patches.</p>
<p><strong>Reliable and Secure Embedded Systems</strong></p>
<p>It is important to note at this point that reliable systems are inherently more secure.  And that, vice versa, secure systems are inherently more reliable.  So although design for reliability and design for security will often individually yield different results, there is also an overlap between them.  </p>
<p>An investment in reliability, for example, generally pays off in security.  Why?  Well, because a more reliable system is more robust in its handling of all errors, whether they are accidental or intentional.  An anti-lock braking system with a fallback to mechanical braking for increased reliability is also more secure against an attack on that critical hardware input sensor.  Similarly, those printers wouldn&#8217;t be at risk of fuser-induced fire in the case of a security breach if they were never at risk of fire in the case of any misbehavior of the software.</p>
<p>Consider, importantly, that one of the first things a hacker intent on breaching the security of your embedded device might do is to perform a (mental, at least) <a href="http://www.fault-tree.net/papers/ericson-fta-tutorial.pdf" title="Fault Tree Analysis" target="_blank">fault tree analysis</a> of your system.  This attacker would then target her time, talents, and other resources at one or more single points of failure she considers most likely to fail in a useful way.  </p>
<p>Because a fault tree analysis starts from the general goal and works inward deductively toward the identification of one or more choke points that might produce the desired erroneous outcome, attention paid to increasing reliability such as via FMEA usually reduces choke points and makes the attacker&#8217;s job considerably more difficult.  Where security can break down even in a reliable system is where the possibility of an attacker&#8217;s intentionally induced failure is ignored in the FMEA weighting and thus possible layers of protection are omitted.</p>
<p>Similarly, an investment in security may pay off in greater reliability&#8211;even without a directed focus on reliability.  For example, if you secure your firmware upgrade process to accept only encrypted and digitally signed binary images you&#8217;ll be adding a layer of protection against an inadvertently corrupted binary causing an accidental error and product failure.  Anything you do to improve the security of communications (e.g., checksums or prevention of buffer overflows) can have a similar effect on reliability.</p>
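<p>The reliability half of that overlap can be sketched in a few lines of C: an integrity check that rejects a firmware image corrupted in transit. The image layout (payload followed by a little-endian CRC-32) and the function names are assumptions made for this example. A CRC only catches accidental corruption; it is not a substitute for the digital signature mentioned above, because an attacker can simply recompute it.</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, as used by Ethernet and zip). */
uint32_t crc32_compute(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? ((crc >> 1) ^ 0xEDB88320u) : (crc >> 1);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical image layout: payload bytes followed by their CRC-32, little-endian. */
bool image_is_intact(const uint8_t *image, size_t image_len)
{
    if (image == NULL || image_len < 4u) {
        return false;
    }

    size_t payload_len = image_len - 4u;
    const uint8_t *t = &image[payload_len];
    uint32_t stored = (uint32_t)t[0]
                    | ((uint32_t)t[1] << 8)
                    | ((uint32_t)t[2] << 16)
                    | ((uint32_t)t[3] << 24);

    return crc32_compute(image, payload_len) == stored;
}
```

<p>A table-driven CRC would be faster on a real target; the bitwise form is used here only because it is short enough to verify by eye.</p>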
<p><strong>The Only Way Forward</strong></p>
<p>Each year it becomes increasingly important for all of us in the embedded systems design community to learn to design reliable and secure products.  If you don&#8217;t, it might be your product making the wrong kind of headlines and your source code and design documents being pored over by lawyers.  It is no longer acceptable to stick your head in the sand on these issues.</p>
]]></content:encoded>
			<wfw:commentRss>http://embeddedgurus.com/barr-code/2012/03/building-reliable-and-secure-embedded-systems/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Embedded Software Training in a Box</title>
		<link>http://embeddedgurus.com/barr-code/2011/05/embedded-software-training-in-a-box/</link>
		<comments>http://embeddedgurus.com/barr-code/2011/05/embedded-software-training-in-a-box/#comments</comments>
		<pubDate>Fri, 06 May 2011 15:40:02 +0000</pubDate>
		<dc:creator>Michael Barr</dc:creator>
				<category><![CDATA[Coding Standards]]></category>
		<category><![CDATA[Efficient C/C++]]></category>
		<category><![CDATA[Firmware Bugs]]></category>
		<category><![CDATA[RTOS Multithreading]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[bugs]]></category>
		<category><![CDATA[education]]></category>
		<category><![CDATA[embedded]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[realtime]]></category>
		<category><![CDATA[rtos]]></category>

		<guid isPermaLink="false">http://embeddedgurus.com/barr-code/?p=614</guid>
		<description><![CDATA[I am beaming with pride. I think we have finally achieved the holy grail of firmware training: Embedded Software Training in a Box. Priced at just $599, the kit includes Everything-You-Need-to-Know-to-Develop-Quality-Reliable-Firmware-in-C, including software for real-time safety-critical systems such as medical devices. In many ways, this product is the culmination of about the last fifteen years [...]]]></description>
			<content:encoded><![CDATA[<p><img align="right" src="http://netrino.com/images/BootCampKit.jpg" alt="Embedded Software Training in a Box" />I am beaming with pride. I think we have finally achieved the holy grail of firmware training: <a href="http://netrino.com/Boot-Camp-Box">Embedded Software Training in a Box</a>. Priced at just $599, the kit includes Everything-You-Need-to-Know-to-Develop-Quality-Reliable-Firmware-in-C, including software for real-time safety-critical systems such as medical devices.</p>
<p>In many ways, this product is the culmination of about the last fifteen years of my career. The knowledge and skills imparted in the kit are drawn from my varied experiences as:</p>
<ul>
<li>An embedded software developer working on real products from consumer electronics to medical devices,</li>
<li>An author (you get copies of <a href="http://netrino.com/Embedded-Systems/Books">all three of my books</a> and the most relevant of <a href="http://netrino.com/Embedded-Systems/How-To">my sixty-odd articles</a>),</li>
<li>A speaker at the <a href="http://esc.eetimes.com">Embedded Systems Conferences</a> since 1998,</li>
<li>The developer of the <a href="http://netrino.com/Boot-Camp">Embedded Software Boot Camp</a> training materials, and</li>
<li>A former technical editor and editor-in-chief of <a href="http://embedded.com">Embedded Systems Design</a> magazine.</li>
</ul>
<p>This kit also&#8211;at long last&#8211;answers the question I&#8217;ve been receiving from around the world since I first started writing articles and books about embedded programming: &#8220;Where/How can I learn to be a great embedded programmer?&#8221; I believe the answer is now as easy as: &#8220;<a href="http://netrino.com/Boot-Camp-Box">Embedded Software Boot Camp in a Box</a>!&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://embeddedgurus.com/barr-code/2011/05/embedded-software-training-in-a-box/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>What NHTSA/NASA Didn&#8217;t Consider re: Toyota&#8217;s Firmware</title>
		<link>http://embeddedgurus.com/barr-code/2011/03/what-nhtsanasa-didnt-consider-re-toyotas-firmware/</link>
		<comments>http://embeddedgurus.com/barr-code/2011/03/what-nhtsanasa-didnt-consider-re-toyotas-firmware/#comments</comments>
		<pubDate>Wed, 02 Mar 2011 23:10:54 +0000</pubDate>
		<dc:creator>Michael Barr</dc:creator>
				<category><![CDATA[Coding Standards]]></category>
		<category><![CDATA[Firmware Bugs]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[bugs]]></category>
		<category><![CDATA[embedded]]></category>
		<category><![CDATA[engineering]]></category>
		<category><![CDATA[ethics]]></category>
		<category><![CDATA[firmware]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[realtime]]></category>
		<category><![CDATA[rtos]]></category>
		<category><![CDATA[safety]]></category>
		<category><![CDATA[standards]]></category>
		<category><![CDATA[trends]]></category>

		<guid isPermaLink="false">http://embeddedgurus.com/barr-code/?p=548</guid>
		<description><![CDATA[In a blog post yesterday (Unintended Acceleration and Other Embedded Software Bugs), I wrote extensively on the report from NASA&#8217;s technical team regarding their analysis of the embedded software in Toyota&#8217;s ETCS-i system. My overall point was that it is hard to judge the quality of their analysis (and thereby the overall conclusion that the [...]]]></description>
			<content:encoded><![CDATA[<p>In a blog post yesterday (<a href="/barr-code/2011/03/unintended-acceleration-and-other-embedded-software-bugs/">Unintended Acceleration and Other Embedded Software Bugs</a>), I wrote extensively on the report from NASA&#8217;s technical team regarding their analysis of the embedded software in Toyota&#8217;s ETCS-i system. My overall point was that it is hard to judge the quality of their analysis (and thereby the overall conclusion that the software isn&#8217;t to blame for unintended accelerations) given the large number of redactions.</p>
<p>I need to put the report down and do some other work at this point, but I have a few other thoughts and observations worth writing down.</p>
<p><strong>Insufficient Explanations</strong></p>
<p>First, some of the explanations offered by Toyota, and apparently accepted by NASA, strike me as insufficient.  For example, at pages 129-132 of <a href="http://www.nhtsa.gov/staticfiles/nvs/pdf/NASA_FR_Appendix_A_Software.pdf">Appendix A</a> to the NASA Report there is a discussion of <a href="http://en.wikipedia.org/wiki/Recursion">recursion</a> in the Toyota firmware. &#8220;The question then is how to verify that the indirect recursion in the ETCS-i does in fact terminate (i.e., has no infinite recursion) and does not cause a stack overflow.&#8221; </p>
<blockquote><p>
&#8220;For the case of stack overflow, [redacted phrase], and therefore a stack overflow condition cannot be detected precisely. It is likely, however, that overflow would cause some form of memory corruption, which would in turn cause some <strong>bad behavior</strong> that would then cause a watchdog timer reset. Toyota relies on this assumption to claim that stack overflow does not occur because no reset occurred during testing.&#8221; (emphasis added)
</p></blockquote>
<p>I have written about what really happens during stack overflow before (<a href="http://embeddedgurus.com/barr-code/2010/03/firmware-specific-bug-4-stack-overflow/">Firmware-Specific Bug #4: Stack Overflow</a>) and this explains why a reset may not result and also why it is so hard to trace a stack overflow back to that root cause. (From page 20, in NASA&#8217;s words: &#8220;The system stack is limited to just 4096 bytes, it is therefore important to secure that no execution can exceed the stack limit. This type of check is normally simple to perform in the absence of recursive procedures, which is standard in safety critical embedded software.&#8221;)</p>
<p>Similarly, &#8220;Toyota designed the software with a high margin of safety with respect to deadlines and timeliness. &#8230; [but] documented no formal verification that all tasks actually meet this deadline requirement.&#8221; and &#8220;All verification of timely behavior is accomplished with CPU load measurements and other measurement-based techniques.&#8221; It&#8217;s not clear to me if the NASA team is saying it buys those Toyota explanations or merely wanted to write them down. However, I do not see a sufficient explanation in this wording from page 132:</p>
<blockquote><p>
&#8220;The [worst case execution time] analysis and recursion analysis involve two distinctly different problems, but they have one thing in common: Both of their failure modes would result in a CPU reset. &#8230; These potential malfunctions, and many others such as concurrency deadlocks and CPU starvation, would <strong>eventually</strong> manifest as a spontaneous system reset.&#8221; (emphasis added)
</p></blockquote>
<p>Might not a <a href="http://embeddedgurus.com/barr-code/2010/11/firmware-specific-bug-7-deadlock/">deadlock</a>, starvation, <a href="http://embeddedgurus.com/barr-code/2010/11/firmware-specific-bug-8-priority-inversion/">priority inversion</a>, or infinite recursion be capable of producing a bit of &#8220;bad behavior&#8221; (perhaps even unintended acceleration) before that &#8220;eventual&#8221; reset? Or might not a stack overflow just corrupt one or a few important variables a little bit, and thereby result in bad behavior rather than or before a reset? These kinds of possibilities, even at very low probabilities, are important to consider in light of NASA&#8217;s calculation that the U.S.-owned Camry 2002-2007 fleet alone is running this software a cumulative one billion hours per year.</p>
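<p>One common way to measure stack margin directly, rather than inferring it from the absence of resets, is to &#8220;paint&#8221; the stack with a known pattern at startup and later count how much of the pattern survives. The sketch below is a generic illustration of that technique in C; nothing in the NASA report indicates whether Toyota used anything like it, and the names and fill pattern are hypothetical. It operates on an ordinary array so that it can be tried on a desktop; on a real target the region would be the actual stack memory.</p>

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_PAINT 0xC5C5C5C5u   /* arbitrary fill pattern */

/* Fill the whole stack region with the pattern, once, at startup. */
void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_PAINT;
    }
}

/*
 * Count the words that still hold the pattern, scanning up from the lowest
 * address. Assumes a descending stack, so the lowest addresses are the last
 * to be used. A result of zero means the stack reached its limit and may
 * have overflowed.
 */
size_t stack_unused_words(const uint32_t *base, size_t words)
{
    size_t unused = 0;

    while (unused < words && base[unused] == STACK_PAINT) {
        unused++;
    }
    return unused;
}
```

<p>A low-priority task or a diagnostic command can call stack_unused_words() periodically and log the minimum seen. The measurement is only as good as the test coverage that drove the stack to its worst case, so it complements static analysis rather than replacing it.</p>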
<p><strong>Paths Not Taken</strong></p>
<p>My second observation is based upon reflection on the steps NASA might have taken in its review of Toyota&#8217;s ETCS-i firmware, but apparently did not. Specifically, there is no mention anywhere (unless it was entirely redacted) of: </p>
<ul>
<li><a href="http://www.netrino.com/Embedded-Systems/How-To/RMA-Rate-Monotonic-Algorithm">rate monotonic analysis</a>, which is a technique that Toyota could have used to validate the critical set of tasks with deadlines and higher-priority ISRs (and that NASA could have applied in its review),</li>
<li><a href="http://en.wikipedia.org/wiki/Cyclomatic_complexity">cyclomatic complexity</a>, which NASA might have used as an additional winnowing tool to focus its limited time on particularly complex and hard-to-test routines,</li>
<li><a href="http://www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/GuidanceDocuments/ucm089543.htm">hazard analysis and mitigation</a>, as those terms are defined by FDA guidelines regarding software contained in medical devices, nor</li>
<li>any discussion or review of Toyota&#8217;s specific software testing regimen and bug tracking system.</li>
</ul>
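<p>To give a flavor of the first item in that list, here is a minimal sketch in C of the classic rate monotonic utilization-bound test. It assumes independent periodic tasks with deadlines equal to their periods and priorities assigned by rate; the task_t type and function names are hypothetical, and the example task sets in the note are invented, not taken from the Toyota firmware.</p>

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    double wcet_ms;     /* worst-case execution time, C */
    double period_ms;   /* period, T (deadline assumed equal to period) */
} task_t;

/* Total CPU utilization: the sum of C/T over all tasks. */
double total_utilization(const task_t *tasks, size_t n)
{
    double u = 0.0;

    for (size_t i = 0; i < n; i++) {
        u += tasks[i].wcet_ms / tasks[i].period_ms;
    }
    return u;
}

/*
 * Sufficient schedulability test: U <= n * (2^(1/n) - 1).
 * Rearranged as (1 + U/n)^n <= 2 so that no math library call is needed.
 */
bool rma_bound_satisfied(const task_t *tasks, size_t n)
{
    if (n == 0) {
        return true;
    }

    double base = 1.0 + total_utilization(tasks, n) / (double)n;
    double product = 1.0;

    for (size_t i = 0; i < n; i++) {
        product *= base;
    }
    return product <= 2.0;
}
```

<p>For example, three tasks with (C, T) of (1, 4), (2, 8), and (3, 20) ms have a utilization of 0.65, which is under the three-task bound of about 0.78, so all deadlines are guaranteed. The test is sufficient but not necessary: a task set that fails it may still be schedulable, and an exact response-time analysis is then required to decide.</p>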
<p>Importantly, there is also a complete absence of discussion of how Toyota&#8217;s ETCS-i firmware versions evolved over time. Which makes and models (and model years) had which versions of that firmware? (Presumably there were also hardware changes worthy of note.) Were updates or patches ever made to cars once they were sold, say while at the dealer during official recalls or other types of service?</p>
]]></content:encoded>
			<wfw:commentRss>http://embeddedgurus.com/barr-code/2011/03/what-nhtsanasa-didnt-consider-re-toyotas-firmware/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>Firmware-Specific Bug #10: Jitter</title>
		<link>http://embeddedgurus.com/barr-code/2010/12/firmware-specific-bug-10-jitter/</link>
		<comments>http://embeddedgurus.com/barr-code/2010/12/firmware-specific-bug-10-jitter/#comments</comments>
		<pubDate>Thu, 02 Dec 2010 11:56:26 +0000</pubDate>
		<dc:creator>Michael Barr</dc:creator>
				<category><![CDATA[Firmware Bugs]]></category>
		<category><![CDATA[RTOS Multithreading]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[bugs]]></category>
		<category><![CDATA[embedded]]></category>
		<category><![CDATA[firmware]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[realtime]]></category>
		<category><![CDATA[rtos]]></category>
		<category><![CDATA[safety]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://embeddedgurus.com/barr-code/?p=421</guid>
		<description><![CDATA[Some real-time systems demand not only that a set of deadlines be always met but also that additional timing constraints be observed in the process, such as managing jitter. An example of jitter is shown in Figure 1. Here a variable amount of work (blue boxes) must be completed before every 10 ms deadline. As [...]]]></description>
			<content:encoded><![CDATA[<p>Some real-time systems demand not only that a set of deadlines be always met but also that additional timing constraints be observed in the process, such as managing jitter.</p>
<p>An example of jitter is shown in Figure 1. Here a variable amount of work (blue boxes) must be completed before every 10 ms deadline. As illustrated in the figure, the deadlines are all met. However, there is considerable timing variation from one run of this job to the next. This jitter is unacceptable in some systems, which should either start or end their 10 ms runs more precisely.</p>
<p><a href='http://eetimes.com/ContentEETimes/Images/Design/Embedded/2010/1110/1110esdBarr03.gif'>Jitter Figure 1</a></p>
<p>If the work to be performed involves sampling a physical input signal, such as reading an analog-to-digital converter, it will often be the case that a precise sampling period will lead to higher accuracy in derived values. For example, variations in the inter-sample time of an optical encoder&#8217;s pulse count will lower the precision of the computed velocity of an attached rotating shaft.</p>
<p><em>Best Practice</em>: The most important single factor in the amount of jitter is the relative priority of the task or ISR that implements the recurrent behavior. The higher the priority, the lower the jitter. The periodic reads of those encoder pulse counts should thus typically be in a timer tick ISR rather than in an RTOS task. </p>
<p>Figure 2 shows how the interval of three different 10 ms recurring samples might be impacted by their relative priorities. At the highest priority is a timer tick ISR, which executes precisely on the 10 ms interval. (Unless there are higher priority interrupts, of course.) Below that is a high-priority task (TH), which may still be able to meet a recurring 10 ms start time precisely. At the bottom, though, is a low priority task (TL) that has its timing greatly affected by what goes on at higher priority levels. As shown, the interval for the low priority task is 10 ms +/- approximately 5 ms.</p>
<p><a href='http://eetimes.com/ContentEETimes/Images/Design/Embedded/2010/1110/1110esdBarr04.gif'>Jitter Figure 2</a></p>
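<p>Before moving a job to a higher priority it helps to measure how much jitter it actually has. A minimal sketch in C, assuming each run of the job records a timestamp from a free-running 32-bit microsecond counter (the function name and the timestamp source are hypothetical):</p>

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the largest deviation, in microseconds, of any interval between
 * consecutive timestamps from the nominal period. Timestamps are assumed to
 * come from a free-running 32-bit microsecond counter that may wrap around.
 */
uint32_t worst_case_jitter_us(const uint32_t *stamps_us, size_t count,
                              uint32_t nominal_us)
{
    uint32_t worst = 0u;

    for (size_t i = 1; i < count; i++) {
        /* Unsigned subtraction gives the right interval even across a wrap. */
        uint32_t interval = stamps_us[i] - stamps_us[i - 1];
        uint32_t dev = (interval > nominal_us) ? (interval - nominal_us)
                                               : (nominal_us - interval);
        if (dev > worst) {
            worst = dev;
        }
    }
    return worst;
}
```

<p>Fed with start times of 0, 10000, 25000, and 30000 microseconds and a nominal period of 10000, the function reports 5000, which corresponds to the +/- 5 ms behavior described above for the low priority task.</p>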
<p><a href="/barr-code/2010/11/firmware-specific-bug-9-incorrect-priority-assignment/">Firmware-Specific Bug #9</a></p>
]]></content:encoded>
			<wfw:commentRss>http://embeddedgurus.com/barr-code/2010/12/firmware-specific-bug-10-jitter/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>What Belongs in a C .h Header File?</title>
		<link>http://embeddedgurus.com/barr-code/2010/11/what-belongs-in-a-c-h-header-file/</link>
		<comments>http://embeddedgurus.com/barr-code/2010/11/what-belongs-in-a-c-h-header-file/#comments</comments>
		<pubDate>Wed, 10 Nov 2010 16:23:11 +0000</pubDate>
		<dc:creator>Michael Barr</dc:creator>
				<category><![CDATA[Coding Standards]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[embedded]]></category>
		<category><![CDATA[firmware]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[standards]]></category>

		<guid isPermaLink="false">http://embeddedgurus.com/barr-code/?p=400</guid>
		<description><![CDATA[What sorts of things should you (or should you not) put in a C language .h header file? When should you create a header file? And why? When I talk to embedded C programmers about hardware interfacing in C or Netrino&#8217;s Embedded C Coding Standard, I often come to see that they lack basic skills [...]]]></description>
			<content:encoded><![CDATA[<p>What sorts of things should you (or should you not) put in a C language .h header file?  When should you create a header file?  And why?</p>
<p>When I talk to embedded C programmers about <a href="http://www.netrino.com/Embedded-Systems/Training-Courses/Embedded-C">hardware interfacing in C</a> or Netrino&#8217;s <a href="http://Netrino.com/Coding-Standard">Embedded C Coding Standard</a>, I often come to see that they lack basic skills and information about the C programming language. This is usually because we are mostly a gang of electrical engineers who are self-taught in C (and every other programming language we use).</p>
<p>When the subject of header files comes up, here&#8217;s my list of do&#8217;s and don&#8217;ts:</p>
<p><strong>DO create one .h header file for each &#8220;module&#8221; of the system.</strong>  A module may comprise one or more compilation units (e.g., .c or .asm source code files).  But it should implement just one aspect of the system.  Examples of well-chosen modules are: a device driver for an A/D converter; a communication protocol, such as FTP; and an alarm manager that is solely responsible for logging error conditions and alerting the user of the active errors.</p>
<p><strong>DO include in the header file all of the function prototypes for the public interface of the module it describes.</strong> For example, a header file adc.h might contain function prototypes for adc_init(), adc_select_input(), and adc_read().</p>
<p><strong>DON&#8217;T include in the header file any other function or macro that may lie inside the module source code.</strong> It is desirable to hide these internal &#8220;helper&#8221; functions inside the implementation.  If it&#8217;s not called from any other module, hide it!  (If your module spans several compilation units that need to share a helper function, then create a separate header file just for this purpose.) Module A should only call Module B through the public interface defined in moduleb.h.</p>
<p><strong>DON&#8217;T include any executable lines of code in a header file, including variable declarations.</strong> But note it is necessary to make an exception for the bodies of some <a href="/barr-code/2011/03/do-inline-function-bodies-belong-in-c-header-files/">inline functions</a>.</p>
<p><strong>DON&#8217;T expose any variable in a header file, as is too often done by way of the &#8216;extern&#8217; keyword.</strong> Proper encapsulation of a module requires data hiding: any and all internal state data should be kept in private variables inside the .c source code files.  Whenever possible these variables should also be declared with keyword &#8216;static&#8217; to enlist the linker&#8217;s help in hiding them.</p>
<p><strong>DON&#8217;T expose the internal format of any module-specific data structure passed to or returned from one or more of the module&#8217;s interface functions.</strong> That is to say there should be no &#8220;struct { &#8230; } foo;&#8221; code in any header file.  If you do have a type you need to pass in and out of your module, so client modules can obtain and hold pointers to instances of it, you can simply &#8220;typedef struct foo moduleb_type&#8221; in the header file.  Client modules should never know, and this way cannot know, the internal format of the struct.</p>
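<p>As a minimal sketch of the do&#8217;s and don&#8217;ts above, the listing below shows what a hypothetical moduleb.h and moduleb.c might contain, combined into one listing for brevity. Only the typedef of moduleb_type comes from the text; the functions moduleb_create(), moduleb_add(), and moduleb_total(), the fields of struct foo, and the fixed-size pool are assumptions made purely for illustration.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ---- moduleb.h: prototypes and an incomplete type, nothing else ---- */
typedef struct foo moduleb_type;

moduleb_type *moduleb_create(void);                        /* hypothetical API */
void          moduleb_add(moduleb_type *p_obj, int32_t value);
int32_t       moduleb_total(const moduleb_type *p_obj);

/* ---- moduleb.c: the only place the struct layout is known ---- */
struct foo {
    int32_t total;
    uint8_t in_use;
};

#define MODULEB_MAX_OBJECTS 2u   /* assumed pool size; avoids malloc() */

/* Private state: 'static' gives internal linkage, hiding it from other modules. */
static struct foo g_pool[MODULEB_MAX_OBJECTS];

moduleb_type *moduleb_create(void)
{
    for (size_t i = 0u; i < MODULEB_MAX_OBJECTS; i++) {
        if (!g_pool[i].in_use) {
            g_pool[i].in_use = 1u;
            g_pool[i].total  = 0;
            return &g_pool[i];
        }
    }
    return NULL;   /* pool exhausted */
}

void moduleb_add(moduleb_type *p_obj, int32_t value)
{
    assert(p_obj != NULL);
    p_obj->total += value;
}

int32_t moduleb_total(const moduleb_type *p_obj)
{
    assert(p_obj != NULL);
    return p_obj->total;
}
```

<p>Because a client that includes only the header part sees an incomplete type, it can hold and pass moduleb_type pointers, but the compiler will reject any attempt to declare an instance directly or to touch the fields.</p>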
<p>Though not really specific to embedded software development, I hope this advice on good C programming practices is useful to you.  If it is, please let me know and I will provide more C advice in future blog posts.</p>
]]></content:encoded>
			<wfw:commentRss>http://embeddedgurus.com/barr-code/2010/11/what-belongs-in-a-c-h-header-file/feed/</wfw:commentRss>
		<slash:comments>40</slash:comments>
		</item>
		<item>
		<title>Rate Monotonic Analysis and Round Robin Scheduling</title>
		<link>http://embeddedgurus.com/barr-code/2010/01/rate-monotonic-analysis-and-round-robin-scheduling/</link>
		<comments>http://embeddedgurus.com/barr-code/2010/01/rate-monotonic-analysis-and-round-robin-scheduling/#comments</comments>
		<pubDate>Sat, 23 Jan 2010 00:29:00 +0000</pubDate>
		<dc:creator>Michael Barr</dc:creator>
				<category><![CDATA[RTOS Multithreading]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[firmware]]></category>
		<category><![CDATA[realtime]]></category>
		<category><![CDATA[rtos]]></category>
		<category><![CDATA[safety]]></category>

		<guid isPermaLink="false">http://www.gfcdev.org/test-stack/2010/01/22/rate-monotonic-analysis-and-round-robin-scheduling/</guid>
		<description><![CDATA[Rate Monotonic Analysis (RMA) is a way of proving a priori via mathematics (rather than post-implementation via testing) that a set of tasks and interrupt service routines (ISRs) will always meet their deadlines&#8211;even under worst-case timing. &#160;In this blog, I address the issue of what to do if two or more tasks or ISRs have [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.netrino.com/Embedded-Systems/How-To/RMA-Rate-Monotonic-Algorithm">Rate Monotonic Analysis (RMA)</a> is a way of proving <i>a priori</i> via mathematics (rather than post-implementation via testing) that a set of tasks and interrupt service routines (ISRs) will always meet their <a href="http://www.netrino.com/Embedded-Systems/Glossary-D#deadline">deadlines</a>&#8211;even under worst-case timing. &nbsp;In this blog, I address the issue of what to do if two or more tasks or ISRs have equal priority and whether round robin scheduling is necessary in an RTOS to deal with that special case.</p>
<p>First a little background. &nbsp;In order for the schedulability analysis portion of the RMA mathematics to provide meaningful results, the following assumptions must hold:</p>
<ul>
<li>A <a href="http://www.netrino.com/Embedded-Systems/How-To/RTOS-Preemption-Multitasking">preemptive</a> <a href="http://www.netrino.com/Embedded-Systems/How-To/RTOS-Selection">real-time operating system (RTOS)</a> is used for scheduling,</li>
<li>Each <a href="http://www.netrino.com/Embedded-Systems/Glossary-T#task">task</a> or <a href="http://www.netrino.com/Embedded-Systems/Glossary-I#interrupt_service_routine">ISR</a> must be assigned a fixed priority (relative to the others) that is not changed while the system runs, and</li>
<li>Unbounded <a href="http://www.netrino.com/Embedded-Systems/How-To/RTOS-Priority-Inversion">priority inversions</a> must be prevented.</li>
</ul>
<p>Under RMA, the relative priorities are assigned according to a simple rule: &#8220;<b>The more often a task or ISR runs (in the worst-case), the higher its priority.</b>&#8221; Put another way, the task or ISR with the longest period between iterations (<i>interarrival time</i>, if you prefer) is least important. This is because an infrequent but high-priority task could cause a more frequent task to miss an entire iteration.</p>
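<p>The rule amounts to sorting by worst-case interarrival time. Here is a minimal sketch in C, assuming a hypothetical task_t record and the convention that priority 0 is the highest; neither is taken from any particular RTOS.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    const char *name;
    uint32_t    period_us;   /* worst-case (shortest) interarrival time */
    uint8_t     priority;    /* filled in below; 0 = highest (assumed convention) */
} task_t;

/* qsort() comparator: shorter period sorts first. */
static int compare_by_period(const void *p_a, const void *p_b)
{
    const task_t *a = p_a;
    const task_t *b = p_b;
    return (a->period_us > b->period_us) - (a->period_us < b->period_us);
}

/* Reorders tasks[] by period and numbers them 0..count-1. */
void rma_assign_priorities(task_t *tasks, size_t count)
{
    assert(tasks != NULL);
    assert(count <= 256u);   /* priorities must fit in uint8_t */

    qsort(tasks, count, sizeof tasks[0], compare_by_period);
    for (size_t i = 0u; i < count; i++) {
        tasks[i].priority = (uint8_t)i;
    }
}
```

<p>Note that qsort() is not a stable sort, so two tasks with the same period come out with adjacent, unequal priorities in an unspecified relative order.</p>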
<p>So what happens if you are using RMA to assign priorities and you wind up with two (or more) tasks or ISRs assigned equal priority? (Translation: they have the same worst-case interarrival times). Must they be assigned equal priority in the real system? What if the RTOS (in the case of tasks) or hardware (in the case of interrupts) doesn&#8217;t support round-robin scheduling&#8211;or even equal priorities with run-to-completion?</p>
<p>Interestingly, it turns out not to matter a bit whether you:</p>
<ol>
<li>Merge the two tasks into one (i.e., execute the code for Task A, then Task B).</li>
<li>Give them equal priority, either with round robin or run-to-completion behavior.</li>
<li>Give them adjacent unequal priorities (in either relative order).</li>
</ol>
<p>If you run through the timing diagrams for each of the above scenarios, you&#8217;ll see that all three are equivalent, except that equal priority with round robin potentially suffers a performance impact from unnecessary additional context switches.</p>
]]></content:encoded>
			<wfw:commentRss>http://embeddedgurus.com/barr-code/2010/01/rate-monotonic-analysis-and-round-robin-scheduling/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Is Reliable Multithreaded Software Possible?</title>
		<link>http://embeddedgurus.com/barr-code/2009/12/is-reliable-multithreaded-software-possible/</link>
		<comments>http://embeddedgurus.com/barr-code/2009/12/is-reliable-multithreaded-software-possible/#comments</comments>
		<pubDate>Wed, 23 Dec 2009 19:48:00 +0000</pubDate>
		<dc:creator>Michael Barr</dc:creator>
				<category><![CDATA[RTOS Multithreading]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[embedded]]></category>
		<category><![CDATA[firmware]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[rtos]]></category>

		<guid isPermaLink="false">http://www.gfcdev.org/test-stack/2009/12/23/is-reliable-multithreaded-software-possible/</guid>
		<description><![CDATA[Until earlier this month, I&#8217;d overlooked a most interesting May 2006 article in Embedded Systems Design magazine by Mark Bereit titled &#8220;Escape the Software Development Paradigm Trap&#8221;. The article opines that the methods we use to design embedded software, particularly multitasked software with interrupt service routines and/or real-time operating systems, are fundamentally incompatible with reliability. [...]]]></description>
			<content:encoded><![CDATA[<p>Until earlier this month, I&#8217;d overlooked a most interesting May 2006 article in <a href="http://www.embedded.com">Embedded Systems Design magazine</a> by <a href="http://www.markbereit.com">Mark Bereit</a> titled &#8220;<a href="http://www.embedded.com/columns/showArticle.jhtml?articleID=186700597">Escape the Software Development Paradigm Trap</a>&#8221;.  The article opines that the methods we use to design embedded software, particularly multitasked software with interrupt service routines and/or real-time operating systems, are fundamentally incompatible with reliability.</p>
<p>Here&#8217;s the critical analogy: </p>
<blockquote><p>Imagine for a minute that I&#8217;ve invented the Universal Bolt. This is a metal object for joining threaded holes that can extend or collapse to fit a variety of lengths. It can expand or contract to fit holes of different diameters. The really cool feature is that I have replaced the bolt&#8217;s spiral ridge with a series of extendable probes that can accommodate different thread pitches. You no longer need to stock a variety of bolts of different sizes and lengths and thread spacings because my Universal Bolt can be used in place of any of them.</p>
<p>Because it&#8217;s able to change configurations extremely quickly, a single Universal Bolt can take the place of many conventional bolts simultaneously. What we do is rig up a clever and very fast dispatcher device that quickly moves the [Universal Bolt] from hole to hole. If the dispatcher is fast enough, my Universal Bolt can spend a moment in each hole in turn and get the whole way through your [mechanical] product so fast that it returns to each hole before the joint has had a chance to separate.</p></blockquote>
<p>You&#8217;d have to be crazy to fly in an airplane designed this way.  &#8220;If anything caused the dispatcher to derail, the entire product would collapse in a second.&#8221;  Yet this analogy describes the design of most products powered by embedded computers.</p>
<blockquote><p>A fast and complex thread dispatcher keeps moving one simple and stupid integer-computation unit all over a big system tending to tasks [and ISRs] rapidly enough that they all get done.  And if that dispatcher ever once leads the CPU into an invalid memory address the whole thing crashes to a halt.</p></blockquote>
<p>Clearly, we need a new paradigm for reliable embedded software architecture.  My thoughts on that are coming to this space in 2010.</p>
]]></content:encoded>
			<wfw:commentRss>http://embeddedgurus.com/barr-code/2009/12/is-reliable-multithreaded-software-possible/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Robust Embedded Software Architecture in 5 Easy Steps</title>
		<link>http://embeddedgurus.com/barr-code/2009/09/robust-embedded-software-architecture-in-5-easy-steps/</link>
		<comments>http://embeddedgurus.com/barr-code/2009/09/robust-embedded-software-architecture-in-5-easy-steps/#comments</comments>
		<pubDate>Thu, 17 Sep 2009 13:32:00 +0000</pubDate>
		<dc:creator>Michael Barr</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[embedded]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[realtime]]></category>

		<guid isPermaLink="false">http://www.gfcdev.org/test-stack/2009/09/17/robust-embedded-software-architecture-in-5-easy-steps/</guid>
		<description><![CDATA[Over the past few years, I’ve spent a large amount of my time consulting with and training software development teams that are in the midst of rearchitecture. These teams have already developed the firmware inside successful long-lived products or product families. But to keep moving forward, reduce bugs, and speed new feature development, they need [...]]]></description>
			<content:encoded><![CDATA[<p>Over the past few years, I’ve spent a large amount of my time consulting with and training software development teams that are in the midst of rearchitecture.  These teams have already developed the firmware inside successful long-lived products or product families.  But to keep moving forward, reduce bugs, and speed new feature development, they need to take the best of their old code and plug it into a better firmware architecture.</p>
<p>In the process, I have collected substantial anecdotal evidence that few programmers, technical managers, or teams truly understand what good firmware architecture is, how to achieve it, or even how to recognize it when they see it.  That includes the most experienced individual developers on a team.  Yet, despite the fact that these teams work in a range of very different industries (including safety-critical medical devices), the rearchitecture process is remarkably similar from my point of view.  And there are numerous ways that our clients’ products and engineering teams would have benefited from getting their firmware architecture right from the beginning.</p>
<p>Though learning to create solid firmware architecture and simultaneously rearchitecting legacy software may take a team months of hard work, five key steps are easily identified.  So whether you are designing firmware architecture from scratch for a new product or launching a rearchitecture effort of your own, here is a step-by-step process to help your team get started on the right foot.</p>
<p><span style="font-weight: bold">Step 1: Identify the Requirements</span></p>
<p>Before we can begin to (re)architect an embedded system or its firmware, we must have clear requirements.  Properly written requirements define the WHAT of a product.  WHAT does the product do for the user, specifically?  For example, if the product is a ventilator, the list of WHAT it does may include a statement such as:</p>
<blockquote><p><span style="font-style: italic">If power is lost during operation, the ventilator shall resume operation according to its last programmed settings within 250 ms of power up.</span></p></blockquote>
<p>Note that a properly written requirement is silent about HOW this particular part of the overall WHAT is to be achieved.  The implementation could be purely electronics or a combination of electronics and firmware; the firmware, if present, might contain an RTOS or it might not.  From the point of view of the requirement writer, then, there may as well be a gnome living inside the product that fulfills the requirement.   (So long as the gnome is trustworthy and immortal, of course!)</p>
<p>Each requirement statement must also be two other things: unambiguous and testable.  An unambiguous statement requires no further explanation.  It is as clear and as concise as possible.  If the requirement includes a mathematical model of expected system behavior, it is helpful to include the equations.</p>
<p>Testability is key.  If a requirement is written properly, a set of tests can be easily constructed to verify that the requirement is met.  Decoupling the tests from the particulars of the implementation, in this manner, is of critical importance.  Many organizations perform extensive testing of the wrong stuff.  Any coupling between the test and the implementation is problematic.</p>
<p>A proper set of requirements is a written list of statements each of which contains the key phrase &#8220;… the [product] shall &#8230;&#8221; and is silent about how it is implemented, unambiguous, and testable.  This may seem like a subject unrelated to architecture, but too often it is poor requirements that constrain architecture.  Thus good architecture depends in part on good requirements.</p>
<p><span style="font-weight: bold">Step 2: Distinguish Architecture from Design</span></p>
<p>Over the years, I have found that many engineers (as well as their managers) struggle to separate the various elements or layers of firmware engineering.  For example, Netrino is barraged with requests for &#8220;design reviews&#8221; that turn out to be &#8220;code reviews&#8221; because the customer is confused about the meaning of &#8220;design&#8221;.  This even happens in organizations that follow a defined software development lifecycle.  We need to clear this up.</p>
<p>The architecture of a system is the outermost layer of HOW. Architecture describes persistent features; the architecture is hard to change and must be got right through careful thinking about intended and permissible uses of the product.  By analogy, an architect describes a new office building only very broadly.  A scale model and drawings show the outer dimensions, foundation, and number of floors.  The number of rooms on each floor and their specific uses are not part of the architecture.</p>
<p>Architecture is best documented via a collection of block diagrams, with directional arrows connecting subsystems.  The system architecture diagram identifies data flows and shows partitioning at the hardware vs. firmware level.  Drilling down, the firmware architecture diagram identifies subsystem-level blocks such as device drivers, RTOS, middleware, and major application components. These architectural diagrams should not have to change even as roadmap features are added to the product—at least for the next few years.  Architectural diagrams should also pass the “six pack test,” which says that even after drinking a six pack every member of the team should still be able to understand the architecture; it is devoid of confusing details and has as few named components as possible.</p>
<p>The design of a system is the middle layer of HOW.  The architecture does not include function or variable names.  A firmware design document identifies these fine-grained details, such as the names and responsibilities of tasks within the specific subsystems or device drivers, the brand of RTOS (if one is used), and the details of the interfaces between subsystems.  The design documents class, task, function/method, parameter and variable names that must be agreed upon by all implementers.  This is similar to how a design firm hired by the renter of a floor on the office building describes the interior and exterior of the new building in finer detail than the architect.  Designers locate and name rooms and give them specific purposes (e.g., cube farm, corner office, or conference room).</p>
<p>An implementation is the lowest layer of HOW.  There need be no document, other than the source code or schematics, to describe the implementation details.  If the interfaces are defined sufficiently at the design level above, individual engineers are able to begin implementation of the various component parts in parallel.  This is similar to the way that a carpenter, plumber, and electrician work in parallel in nearby space, applying their own judgment about the finer details of component placement, after the design has been approved by the lessee.</p>
<p>Of course, there is architecture and there is good architecture.  Good architecture makes the most difficult parts of the project easy.  These difficult parts vary in importance somewhat from industry to industry, but always center on three big challenges that must be traded off against each other: meeting real-time deadlines, testing, and diversity management.  Addressing those issues comprises the final three steps.</p>
<p><span style="font-weight: bold">Step 3: Manage CPU Time</span></p>
<p>Some of your product’s requirements will mention explicit amounts of time.  For example, consider the earlier ventilator requirement about doing something “within 250 ms of power up.”  That is a timeliness requirement.  “Within 250 ms of power up” is just one deadline for the ventilator implementation team to meet.  (And something to be tested under a variety of scenarios.)  The architecture should make it easy to meet this deadline, as well as to be certain it will always be met.</p>
<p>Most products feature a mix of non-real-time, soft-real-time, and hard-real-time requirements.  Soft deadlines are usually the most challenging to define in an unambiguous manner, test, and implement.  For example, in set-top box design it may be acceptable to drop a frame of video once in a while, but never more than two in a row, and never any audio, which arrives in the same digital input stream.  The simplest way to handle soft deadlines is to treat them as hard deadlines that must always be met.</p>
<p>With deadlines identified, the first step in architecture is to push as many of the timeliness requirements as possible out of the software and onto the hardware.  Figure 1 shows the preferred placement of real-time functionality.  As indicated, an FPGA or a dedicated CPU is the ideal place to put real-time functionality (irrespective of the length of the deadline).  Only when that is not possible should an interrupt service routine (ISR) be used instead.  And only when an ISR won’t work should a high-priority task be used.</p>
<p style="text-align: center">
<div style="text-align: center"><a href="http://embeddedgurus.com/barr-code/files/2009/09/BarrCode_200909_Architecture_fig1.gif"><img class="aligncenter size-medium wp-image-425" title="Where to Put Real-Time Functionality" src="http://embeddedgurus.com/barr-code/files/2009/09/BarrCode_200909_Architecture_fig1-300x147.gif" alt="Where to Put Real-Time Functionality" width="300" height="147" /></a></div>
<p><span style="font-style: italic">Figure 1.  Where to Put Real-Time Functionality</span></p>
<p>Keeping the real-time functionality separate from the bulk of the software is valuable for two important reasons.  First, it simplifies the design and implementation of the non-real-time software.  With timeliness requirements architected out of the bulk of the software, code written by novice implementers can be used without affecting user safety.</p>
<p>A second advantage of keeping the real-time functionality together is that it simplifies the analysis involved in proving all deadlines are always met.  If all of the real-time software is segregated into ISRs and high-priority tasks, the amount of work required to perform rate monotonic analysis (RMA) is significantly reduced.  Additionally, once the RMA is completed, it need not be revised every time the non-real-time code is tweaked or added to.</p>
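<p>To give a feel for how small that analysis can become once the real-time code is isolated, here is a minimal sketch of one well-known schedulability check for fixed-priority preemptive systems: iterative response-time analysis.  It is offered as an illustration rather than as the specific procedure meant above.  It assumes each deadline equals the task period, ignores blocking and context-switch overhead, and expects the task array to be ordered highest priority first; the type and function names are hypothetical.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t wcet_us;     /* worst-case execution time, C */
    uint32_t period_us;   /* period, T; deadline assumed equal to T */
} rt_task_t;

/*
 * Returns true if every task's worst-case response time fits in its period.
 * tasks[] must be ordered highest priority first (shortest period first under RMA).
 */
bool rt_all_deadlines_met(const rt_task_t *tasks, size_t count)
{
    assert(tasks != NULL);

    for (size_t i = 0u; i < count; i++) {
        assert(tasks[i].period_us > 0u);
        uint32_t response = tasks[i].wcet_us;

        for (;;) {
            /* R = C_i + sum over higher-priority j of ceil(R / T_j) * C_j */
            uint32_t next = tasks[i].wcet_us;
            for (size_t j = 0u; j < i; j++) {
                uint32_t releases =
                    (response + tasks[j].period_us - 1u) / tasks[j].period_us;
                next += releases * tasks[j].wcet_us;
            }
            if (next > tasks[i].period_us) {
                return false;   /* task i can miss its deadline */
            }
            if (next == response) {
                break;          /* fixed point reached: worst-case response time */
            }
            response = next;
        }
    }
    return true;
}
```

<p>Only the ISRs and high-priority tasks need to appear in the array, which is why changes to the non-real-time code leave the result untouched.</p>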
<p><span style="font-weight: bold">Step 4: Design for Test</span></p>
<p>Every embedded system needs to be tested.  Generally, it is also valuable or mandatory that testing be performed at several levels.  The most common levels of testing are:</p>
<p>• <span style="font-style: italic">System Tests</span> verify that the product as a whole meets or exceeds the stated requirements. System tests are generally best developed outside of the engineering department, though they may fit into a test harness developed by engineers.</p>
<p>• <span style="font-style: italic">Integration Tests</span> verify that a subset of the subsystems identified in the architecture diagrams interact as expected and produce reasonable outcomes.  Integration tests are generally best developed by a testing group or person within software engineering.</p>
<p>• <span style="font-style: italic">Unit Tests</span> verify that individual software components identified at the intermediate design level perform as their implementers expect.  That is, they test at the level of the public API the component presents to other components.  Unit tests are generally best developed by the same people that write the code under test.</p>
<p>Of the three, system tests are most easily developed, as those test the product at its exposed hardware interfaces to the world (e.g., does the dialysis machine perform as required).  Of course, a test harness may need to be developed for engineering and/or factory acceptance tests.  But this is generally still easier than integration and unit tests, which demand additional visibility inside the device as it operates.</p>
<p>To make the development, use, and maintenance of integration and unit tests easy it is valuable to architect the firmware in a manner compatible with a software test framework.  The single best way to do this is to architect the interactions between all software components at the levels you want to test so they are based on publish-subscribe event passing (a.k.a., message passing).</p>
<p>Interaction based on a publish-subscribe model allows a lightweight test framework like the one shown in Figure 2 to be inserted alongside the software component(s) under test.  Any interface between the test framework and the outside world, such as a serial port, provides an easy way to inject or log events.  A test engine on the other side of that communications interface can then be designed to accept test “scripts” as input, log subscribed event occurrences, and off-line check logged events against valid result sequences.  Adding timestamps to the event logger and scripting language features like delay(time) and waitfor(event) significantly increases testing capability.</p>
<p style="text-align: center">
<div style="text-align: center"><a href="http://embeddedgurus.com/barr-code/files/2009/09/BarrCode_200909_Architecture_fig2.gif"><img class="aligncenter size-medium wp-image-426" title="A Test Framework Based on a Publish-Subscribe Event Bus" src="http://embeddedgurus.com/barr-code/files/2009/09/BarrCode_200909_Architecture_fig2-300x175.gif" alt="A Test Framework Based on a Publish-Subscribe Event Bus" width="300" height="175" /></a></div>
<p><span style="font-style: italic">Figure 2.  A Test Framework Based on a Publish-Subscribe Event Bus</span></p>
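<p>A minimal sketch of the idea in C follows.  Components interact only by publishing and subscribing to events, so a test logger can be attached as just another subscriber.  The event names, table sizes, and function names are all hypothetical, and for brevity the sketch dispatches synchronously, whereas a real system would more likely queue events per task.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { EVT_BUTTON_PRESSED, EVT_DOOR_OPENED, EVT_COUNT } event_id_t;

typedef struct {
    event_id_t id;
    uint32_t   param;
} event_t;

typedef void (*event_handler_t)(const event_t *p_event);

#define MAX_SUBSCRIBERS_PER_EVENT 4u
static event_handler_t g_subscribers[EVT_COUNT][MAX_SUBSCRIBERS_PER_EVENT];

/* Registers a handler for one event type; returns 0 on success, -1 if full. */
int bus_subscribe(event_id_t id, event_handler_t handler)
{
    assert(id < EVT_COUNT && handler != NULL);
    for (size_t i = 0u; i < MAX_SUBSCRIBERS_PER_EVENT; i++) {
        if (g_subscribers[id][i] == NULL) {
            g_subscribers[id][i] = handler;
            return 0;
        }
    }
    return -1;
}

/* Delivers the event to every handler subscribed to its type. */
void bus_publish(const event_t *p_event)
{
    assert(p_event != NULL && p_event->id < EVT_COUNT);
    for (size_t i = 0u; i < MAX_SUBSCRIBERS_PER_EVENT; i++) {
        event_handler_t handler = g_subscribers[p_event->id][i];
        if (handler != NULL) {
            handler(p_event);
        }
    }
}

/* Test-framework side: a subscriber that only records what it sees. */
#define LOG_CAPACITY 16u
static event_t g_log[LOG_CAPACITY];
static size_t  g_log_count;

void test_logger(const event_t *p_event)
{
    if (g_log_count < LOG_CAPACITY) {
        g_log[g_log_count++] = *p_event;
    }
}

size_t test_log_count(void)
{
    return g_log_count;
}

const event_t *test_log_entry(size_t index)
{
    assert(index < g_log_count);
    return &g_log[index];
}
```

<p>In a real test framework the logger would forward each record over the communications interface, and injected events would arrive the same way, but the components under test would not need to change.</p>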
<p>It is unfortunate that the publish-subscribe component interaction model is at odds with proven methods of analyzing software schedulability (e.g., RMA).  The sheer number of possible message arrival orders, queue depths, and other details makes the analysis portion of guaranteeing timeliness difficult and fragile against minor implementation changes.  This is, in fact, why it is important to separate the code that must meet deadlines from the rest of the software.  In this architecture, though, the real-time functionality remains difficult to test other than at the system level.</p>
<p><span style="font-weight: bold">Step 5: Plan for Change</span></p>
<p>The third key consideration during the firmware architecture phase of the project is the management of feature diversity and product customizations.  Many companies use a single source code base to build firmware for a family of related products.  For example, consider microwave ovens; though one high-end model may feature a dedicated “popcorn” button, another may lack this.  The architecture of any new product’s firmware will also soon be tested and stretched in the direction of foreseeable planned feature additions along the product road map.</p>
<p>To plan for change, you must first understand the types of changes that occur in your specific product.  Then architect the firmware so that those sorts of changes are the easiest to make.  If the software is architected well, feature diversity can be managed through a single software build with compile-time and/or run-time behavioral switches in the firmware.  Similarly, new features can be added easily to a good architecture without breaking the existing product’s functionality.</p>
<p>An architectural approach that handles product family diversity particularly well is one in which groups of related software components are collected into “packages”.  Each such package is effectively an internal widget from which larger products can be built.  The source code and unit tests for each particular package should be maintained by a team of “package developers” focused primarily on their stability and ease of use.</p>
<p>Teams of “product developers” combine stable releases of packages that contain the features they need, customize each as appropriate (e.g., via compile- or run-time mechanisms, or both) to their particular product, and add product-specific “glue.”  Typically, all of the products in a related product family are built upon a common “Foundation” package (think API).  For example, a Model X microwave might be built from Foundation + Package A + Package B; whereas Model Y might consist of Foundation + A’ + B + C, where package A’ is a compile-time variant of package A and package C contains optional high-level cooking features, such as “Popcorn.”</p>
<p>Using this approach in a large organization, a new product built from a selection of stable bug-free packages can be brought to market quickly—and all products share an easy upgrade path as their core packages are improved.  The main challenge in planning for change of this sort is in striking the right balance between packages that are too small and packages that are too large.   Like many of the details of firmware architecture, achieving that balance for a number of years is more of an art than a science.</p>
<p><span style="font-weight: bold">Next Steps</span></p>
<p>I hope the five-step “architecture road map” presented here is useful to you.  I plan to drill down into more of the details in articles and blog posts over the coming months.  Meanwhile your constructive feedback is welcome via the comment form or e-mail.</p>
]]></content:encoded>
			<wfw:commentRss>http://embeddedgurus.com/barr-code/2009/09/robust-embedded-software-architecture-in-5-easy-steps/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

