<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Efficient C Tips #10 &#8211; Use unsigned integers</title>
	<atom:link href="http://embeddedgurus.com/stack-overflow/2009/07/efficient-c-tips-10-use-unsigned-integers/feed/" rel="self" type="application/rss+xml" />
	<link>http://embeddedgurus.com/stack-overflow/2009/07/efficient-c-tips-10-use-unsigned-integers/</link>
	<description>Thoughts on embedded systems by Nigel Jones</description>
	<lastBuildDate>Thu, 09 Feb 2012 07:32:39 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: Misha</title>
		<link>http://embeddedgurus.com/stack-overflow/2009/07/efficient-c-tips-10-use-unsigned-integers/comment-page-1/#comment-7366</link>
		<dc:creator>Misha</dc:creator>
		<pubDate>Sun, 09 Oct 2011 15:29:26 +0000</pubDate>
		<guid isPermaLink="false">http://www.gfcdev.org/test-stack/2009/07/31/efficient-c-tips-10-use-unsigned-integers/#comment-7366</guid>
		<description>Yes, you interpreted it correctly.

Here are the answers (to the best of my knowledge) to the questions you raised.

&quot;Let’s say you perform a series of additions on unsigned integers. Does the compiler zero out the guard bits after every operation, or does it simply wait until the end of the series and then perform the mask?&quot;
This particular compiler (for a Cirrus Logic DSP) performs the mask only in two places: before a comparison and before shifting right. If I&#039;m not mistaken, those are the only places where it matters. Of course, memory load/store instructions that do not perform saturation just ignore the guard part, so there is no problem there, and the same thing happens when performing multiplication, etc.
So, I believe that in a way it is different, i.e. simpler, than the case when you have, e.g., 16 bit arithmetic hardware but are using an 8 bit data type. The case in DSP is that, for example, you use a 16 bit data type, your hardware supports 16 bit arithmetic, but your register has additional special purpose bits on the left.


&quot;What happens if you do the same thing with signed integers?&quot;
With signed integers nothing is done with the guard bits. Since signed overflow is undefined in the C standard, the DSP hardware and compiler define it in whatever way produces the best code. However, the manual of the DSP I&#039;m currently working with does not explain it in much detail; rather, it takes the position that &quot;overflow is bad&quot; and that the programmer should avoid it.


&quot;Also if you actually get an overflow with signed integers, do you get the ‘expected’ result, or do you get garbage?&quot;
If by &quot;expected&quot; you mean an &quot;always the same&quot;, or &quot;determinable&quot;, result, then you do get the &quot;expected&quot; result :) But it is definitely different from what you would get if overflow were handled as wrap around. I call the behavior &quot;extended wrap around&quot; or &quot;delayed wrap around&quot; because wrap around does happen, but only after the guard part is filled up.


Here is another interesting thing.
Luís Fernando says:
&quot;My ideal solution would be leaving unsigned overflow unspecified and adding a modular modifier. Or you could add a type attribute to the compiler.&quot;
The DSP compiler I&#039;m working with supports a C language extension called &quot;C - Extensions to support embedded processors&quot; (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf). It mostly targets DSP related issues. The main thing the extension introduces is fixed-point types. In defining the overflow behavior of fixed-point types, they did something similar to what Luís Fernando proposed. If a type has the &quot;sat&quot; word (e.g. &quot;sat fract&quot;) then saturation has to happen on overflow; otherwise the behavior is undefined, that is, it should be done in the most efficient manner on the target processor. There is also a pragma that can force saturation on all fixed-point types. A similar thing might be done for signed integers.</description>
		<content:encoded><![CDATA[<p>Yes, you interpreted it correctly.</p>
<p>Here are the answers (to the best of my knowledge) to the questions you raised.</p>
<p>&#8220;Let’s say you perform a series of additions on unsigned integers. Does the compiler zero out the guard bits after every operation, or does it simply wait until the end of the series and then perform the mask?&#8221;<br />
This particular compiler (for a Cirrus Logic DSP) performs the mask only in two places: before a comparison and before shifting right. If I&#8217;m not mistaken, those are the only places where it matters. Of course, memory load/store instructions that do not perform saturation just ignore the guard part, so there is no problem there, and the same thing happens when performing multiplication, etc.<br />
So, I believe that in a way it is different, i.e. simpler, than the case when you have, e.g., 16 bit arithmetic hardware but are using an 8 bit data type. The case in DSP is that, for example, you use a 16 bit data type, your hardware supports 16 bit arithmetic, but your register has additional special purpose bits on the left.</p>
<p>&#8220;What happens if you do the same thing with signed integers?&#8221;<br />
With signed integers nothing is done with the guard bits. Since signed overflow is undefined in the C standard, the DSP hardware and compiler define it in whatever way produces the best code. However, the manual of the DSP I&#8217;m currently working with does not explain it in much detail; rather, it takes the position that &#8220;overflow is bad&#8221; and that the programmer should avoid it.</p>
<p>&#8220;Also if you actually get an overflow with signed integers, do you get the ‘expected’ result, or do you get garbage?&#8221;<br />
If by &#8220;expected&#8221; you mean an &#8220;always the same&#8221;, or &#8220;determinable&#8221;, result, then you do get the &#8220;expected&#8221; result <img src='http://embeddedgurus.com/stack-overflow/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  But it is definitely different from what you would get if overflow were handled as wrap around. I call the behavior &#8220;extended wrap around&#8221; or &#8220;delayed wrap around&#8221; because wrap around does happen, but only after the guard part is filled up.</p>
<p>Here is another interesting thing.<br />
Luís Fernando says:<br />
&#8220;My ideal solution would be leaving unsigned overflow unspecified and adding a modular modifier. Or you could add a type attribute to the compiler.&#8221;<br />
The DSP compiler I&#8217;m working with supports a C language extension called &#8220;C &#8211; Extensions to support embedded processors&#8221; (<a href="http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf" rel="nofollow">http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf</a>). It mostly targets DSP related issues. The main thing the extension introduces is fixed-point types. In defining the overflow behavior of fixed-point types, they did something similar to what Luís Fernando proposed. If a type has the &#8220;sat&#8221; word (e.g. &#8220;sat fract&#8221;) then saturation has to happen on overflow; otherwise the behavior is undefined, that is, it should be done in the most efficient manner on the target processor. There is also a pragma that can force saturation on all fixed-point types. A similar thing might be done for signed integers.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nigel Jones</title>
		<link>http://embeddedgurus.com/stack-overflow/2009/07/efficient-c-tips-10-use-unsigned-integers/comment-page-1/#comment-6935</link>
		<dc:creator>Nigel Jones</dc:creator>
		<pubDate>Thu, 29 Sep 2011 14:43:54 +0000</pubDate>
		<guid isPermaLink="false">http://www.gfcdev.org/test-stack/2009/07/31/efficient-c-tips-10-use-unsigned-integers/#comment-6935</guid>
		<description>Very interesting Misha. If I interpret this correctly, what&#039;s going on is this. The C language defines what happens when you overflow unsigned integers (namely modulo arithmetic) and thus the compiler is forced to add a zero operation on the guard bits. By contrast, for signed integers the effect of overflow is undefined, and thus no such zeroing is required.  I&#039;m intrigued by the following. Let&#039;s say you perform a series of additions on unsigned integers. Does the compiler zero out the guard bits after every operation, or does it simply wait until the end of the series and then perform the mask? What happens if you do the same thing with signed integers? Also if you actually get an overflow with signed integers, do you get the &#039;expected&#039; result, or do you get garbage?</description>
		<content:encoded><![CDATA[<p>Very interesting Misha. If I interpret this correctly, what&#8217;s going on is this. The C language defines what happens when you overflow unsigned integers (namely modulo arithmetic) and thus the compiler is forced to add a zero operation on the guard bits. By contrast, for signed integers the effect of overflow is undefined, and thus no such zeroing is required.  I&#8217;m intrigued by the following. Let&#8217;s say you perform a series of additions on unsigned integers. Does the compiler zero out the guard bits after every operation, or does it simply wait until the end of the series and then perform the mask? What happens if you do the same thing with signed integers? Also if you actually get an overflow with signed integers, do you get the &#8216;expected&#8217; result, or do you get garbage?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Misha</title>
		<link>http://embeddedgurus.com/stack-overflow/2009/07/efficient-c-tips-10-use-unsigned-integers/comment-page-1/#comment-6934</link>
		<dc:creator>Misha</dc:creator>
		<pubDate>Thu, 29 Sep 2011 14:06:25 +0000</pubDate>
		<guid isPermaLink="false">http://www.gfcdev.org/test-stack/2009/07/31/efficient-c-tips-10-use-unsigned-integers/#comment-6934</guid>
		<description>Just a couple of days ago I started reading this site. In my opinion all the tips are great and explained exceptionally well. The other posts are good too; thank you for sharing your experience, Nigel.

I come from the DSP embedded world, which is slightly different from what is in focus here, but I must say that all the tips (13 so far) apply to that domain as well, except this one :)

The reason why signed int produces better code on DSP processors is explained well in the comments of Luís Fernando Schultz Xavier da Silveira. He was considering the case when the size of the type is less than the size of the register (the supported arithmetic), e.g. an 8 bit int in a 16 bit register. Well, this is the case practically all the time in DSP. The fact is that accumulator registers in DSPs almost always have some kind of guard bits, which are there to catch overflow and subsequently perform saturation. Therefore, for unsigned integers the guard bits must be zeroed by the compiler either before every comparison, or after every add/sub and shift left operation.

However, in your example with the index boundary check, I think the second code would still be better in most cases, because one comparison plus bit zeroing beats two comparisons :)
(I just tried it with a not commonly used Cirrus Logic DSP, and the result is 4 against 5 instructions for the second version)
On the other hand, in the case of, for example, a simple comparison, unsigned integers will introduce two extra instructions (one guard zeroing for every operand).

Anyhow, I just wanted to share my experience regarding this tip because, as I said, all the other tips apply perfectly to the DSP world as well.</description>
		<content:encoded><![CDATA[<p>Just a couple of days ago I started reading this site. In my opinion all the tips are great and explained exceptionally well. The other posts are good too; thank you for sharing your experience, Nigel.</p>
<p>I come from the DSP embedded world, which is slightly different from what is in focus here, but I must say that all the tips (13 so far) apply to that domain as well, except this one <img src='http://embeddedgurus.com/stack-overflow/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>The reason why signed int produces better code on DSP processors is explained well in the comments of Luís Fernando Schultz Xavier da Silveira. He was considering the case when the size of the type is less than the size of the register (the supported arithmetic), e.g. an 8 bit int in a 16 bit register. Well, this is the case practically all the time in DSP. The fact is that accumulator registers in DSPs almost always have some kind of guard bits, which are there to catch overflow and subsequently perform saturation. Therefore, for unsigned integers the guard bits must be zeroed by the compiler either before every comparison, or after every add/sub and shift left operation.</p>
<p>However, in your example with the index boundary check, I think the second code would still be better in most cases, because one comparison plus bit zeroing beats two comparisons <img src='http://embeddedgurus.com/stack-overflow/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /><br />
(I just tried it with a not commonly used Cirrus Logic DSP, and the result is 4 against 5 instructions for the second version)<br />
On the other hand, in the case of, for example, a simple comparison, unsigned integers will introduce two extra instructions (one guard zeroing for every operand).</p>
<p>Anyhow, I just wanted to share my experience regarding this tip because, as I said, all the other tips apply perfectly to the DSP world as well.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nigel Jones</title>
		<link>http://embeddedgurus.com/stack-overflow/2009/07/efficient-c-tips-10-use-unsigned-integers/comment-page-1/#comment-3449</link>
		<dc:creator>Nigel Jones</dc:creator>
		<pubDate>Wed, 23 Feb 2011 19:40:55 +0000</pubDate>
		<guid isPermaLink="false">http://www.gfcdev.org/test-stack/2009/07/31/efficient-c-tips-10-use-unsigned-integers/#comment-3449</guid>
		<description>Thank you. I really appreciate it when someone takes the time to inform me of these problems. It&#039;s now fixed.</description>
		<content:encoded><![CDATA[<p>Thank you. I really appreciate it when someone takes the time to inform me of these problems. It&#8217;s now fixed.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: itsnotvalid</title>
		<link>http://embeddedgurus.com/stack-overflow/2009/07/efficient-c-tips-10-use-unsigned-integers/comment-page-1/#comment-3440</link>
		<dc:creator>itsnotvalid</dc:creator>
		<pubDate>Tue, 22 Feb 2011 21:25:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.gfcdev.org/test-stack/2009/07/31/efficient-c-tips-10-use-unsigned-integers/#comment-3440</guid>
		<description>The link referring to &quot;Signed versus unsigned integers&quot; should be http://www.embeddedgurus.net/stack-overflow/2009/05/signed-versus-unsigned-integers/. Probably a leftover from a changed URL scheme.</description>
		<content:encoded><![CDATA[<p>The link referring to &#8220;Signed versus unsigned integers&#8221; should be <a href="http://www.embeddedgurus.net/stack-overflow/2009/05/signed-versus-unsigned-integers/" rel="nofollow">http://www.embeddedgurus.net/stack-overflow/2009/05/signed-versus-unsigned-integers/</a>. Probably a leftover from a changed URL scheme.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Luís Fernando Schultz Xavier da Silveira</title>
		<link>http://embeddedgurus.com/stack-overflow/2009/07/efficient-c-tips-10-use-unsigned-integers/comment-page-1/#comment-2612</link>
		<dc:creator>Luís Fernando Schultz Xavier da Silveira</dc:creator>
		<pubDate>Tue, 09 Nov 2010 21:01:10 +0000</pubDate>
		<guid isPermaLink="false">http://www.gfcdev.org/test-stack/2009/07/31/efficient-c-tips-10-use-unsigned-integers/#comment-2612</guid>
		<description>In this case, the instructions

\ 0000000A 01D1 BNE.N ??f_0
\ 0000000C 0120 MOVS R0,#+1
\ 0000000E 7047 BX LR
\ ??f_0:
\ 00000010 0020 MOVS R0,#+0
\ 00000012 7047 BX LR ;; return

could be replaced by

MOVEQ R0, #1
MOVNE R0, #0
BX R14

for a faster and smaller executable, no?</description>
		<content:encoded><![CDATA[<p>In this case, the instructions</p>
<p>\ 0000000A 01D1 BNE.N ??f_0<br />
\ 0000000C 0120 MOVS R0,#+1<br />
\ 0000000E 7047 BX LR<br />
\ ??f_0:<br />
\ 00000010 0020 MOVS R0,#+0<br />
\ 00000012 7047 BX LR ;; return</p>
<p>could be replaced by</p>
<p>MOVEQ R0, #1<br />
MOVNE R0, #0<br />
BX R14</p>
<p>for a faster and smaller executable, no?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nigel Jones</title>
		<link>http://embeddedgurus.com/stack-overflow/2009/07/efficient-c-tips-10-use-unsigned-integers/comment-page-1/#comment-2611</link>
		<dc:creator>Nigel Jones</dc:creator>
		<pubDate>Tue, 09 Nov 2010 20:40:11 +0000</pubDate>
		<guid isPermaLink="false">http://www.gfcdev.org/test-stack/2009/07/31/efficient-c-tips-10-use-unsigned-integers/#comment-2611</guid>
		<description>Balanced optimization is when the compiler attempts to achieve a balance between size and speed.</description>
		<content:encoded><![CDATA[<p>Balanced optimization is when the compiler attempts to achieve a balance between size and speed.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Luís Fernando Schultz Xavier da Silveira</title>
		<link>http://embeddedgurus.com/stack-overflow/2009/07/efficient-c-tips-10-use-unsigned-integers/comment-page-1/#comment-2610</link>
		<dc:creator>Luís Fernando Schultz Xavier da Silveira</dc:creator>
		<pubDate>Tue, 09 Nov 2010 19:16:32 +0000</pubDate>
		<guid isPermaLink="false">http://www.gfcdev.org/test-stack/2009/07/31/efficient-c-tips-10-use-unsigned-integers/#comment-2610</guid>
		<description>What is full &#039;balanced&#039; optimization?

Just to clarify something, I share your view on how to use unsigned and signed types. My ideal solution would be leaving unsigned overflow unspecified and adding a modular modifier. Or you could add a type attribute to the compiler. My current solution is to use different types when declaring variables that should go in registers (u8_reg_t), which is a pain.

I also have searched a lot for an answer, but found nothing, not even a concern about the problem.</description>
		<content:encoded><![CDATA[<p>What is full &#8216;balanced&#8217; optimization?</p>
<p>Just to clarify something, I share your view on how to use unsigned and signed types. My ideal solution would be leaving unsigned overflow unspecified and adding a modular modifier. Or you could add a type attribute to the compiler. My current solution is to use different types when declaring variables that should go in registers (u8_reg_t), which is a pain.</p>
<p>I also have searched a lot for an answer, but found nothing, not even a concern about the problem.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nigel Jones</title>
		<link>http://embeddedgurus.com/stack-overflow/2009/07/efficient-c-tips-10-use-unsigned-integers/comment-page-1/#comment-2609</link>
		<dc:creator>Nigel Jones</dc:creator>
		<pubDate>Tue, 09 Nov 2010 16:40:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.gfcdev.org/test-stack/2009/07/31/efficient-c-tips-10-use-unsigned-integers/#comment-2609</guid>
		<description>I was in a bit of a hurry when I posted the code. I should have stated that it was compiled with full &#039;balanced&#039; optimization. Anyway, I think you have an interesting point, and certainly it isn&#039;t one I have considered before. In my experience unsigned code is never slower than signed code (and is often faster). However, I do most of my work on 8 / 16 bit machines, where sign extension is less of an issue. You have given me something to ponder, and I will continue to perform ad hoc benchmarking with my code to see if this continues to be the case with the move to 32 bit processors. I&#039;d be interested to hear from anyone who is doing a lot of 32 / 64 bit work who has any thoughts on this topic.</description>
		<content:encoded><![CDATA[<p>I was in a bit of a hurry when I posted the code. I should have stated that it was compiled with full &#8216;balanced&#8217; optimization. Anyway, I think you have an interesting point, and certainly it isn&#8217;t one I have considered before. In my experience unsigned code is never slower than signed code (and is often faster). However, I do most of my work on 8 / 16 bit machines, where sign extension is less of an issue. You have given me something to ponder, and I will continue to perform ad hoc benchmarking with my code to see if this continues to be the case with the move to 32 bit processors. I&#8217;d be interested to hear from anyone who is doing a lot of 32 / 64 bit work who has any thoughts on this topic.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Luís Fernando Schultz Xavier da Silveira</title>
		<link>http://embeddedgurus.com/stack-overflow/2009/07/efficient-c-tips-10-use-unsigned-integers/comment-page-1/#comment-2608</link>
		<dc:creator>Luís Fernando Schultz Xavier da Silveira</dc:creator>
		<pubDate>Tue, 09 Nov 2010 14:50:52 +0000</pubDate>
		<guid isPermaLink="false">http://www.gfcdev.org/test-stack/2009/07/31/efficient-c-tips-10-use-unsigned-integers/#comment-2608</guid>
		<description>I assume the UXTH and SXTH instructions perform zero extension and sign extension, respectively. Doesn&#039;t the ARM ABI specify that halfword values passed in registers are already zero/sign extended? Even if it does not, this is sort of an ABI problem. My whole point is that if the ABI demanded sign extension of signed values, the SXTH instructions would not be necessary, while the UXTH instructions still would be (even if zero extension is demanded).

By the way, is optimization turned on? I don&#039;t see why the use of predicated opcodes was abandoned in favor of plain branching.

Well, I still believe there is a fundamental problem with C providing signed types without overflow semantics but not doing the same with unsigned types. Maybe there should be signed, unsigned and modular types. However, since other languages (such as Ada) do the same, maybe I am wrong, but I can&#039;t grasp why.</description>
		<content:encoded><![CDATA[<p>I assume the UXTH and SXTH instructions perform zero extension and sign extension, respectively. Doesn&#8217;t the ARM ABI specify that halfword values passed in registers are already zero/sign extended? Even if it does not, this is sort of an ABI problem. My whole point is that if the ABI demanded sign extension of signed values, the SXTH instructions would not be necessary, while the UXTH instructions still would be (even if zero extension is demanded).</p>
<p>By the way, is optimization turned on? I don&#8217;t see why the use of predicated opcodes was abandoned in favor of plain branching.</p>
<p>Well, I still believe there is a fundamental problem with C providing signed types without overflow semantics but not doing the same with unsigned types. Maybe there should be signed, unsigned and modular types. However, since other languages (such as Ada) do the same, maybe I am wrong, but I can&#8217;t grasp why.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

