You can’t beat a good compiler… right?
TweetRight! Sort of.
It is extremely well worn conventional wisdom that for your “average” programmer, it’s pretty much impossible to beat a good optimizing compiler at writing fast code. For those who skip the “understanding” part, and jump straight into the “application” part, this rule of thumb tends to be grossly misunderstood, and misused. For these people, compilers are so smart, and so good, that micro-optimization is basically impossible (uh, no), or too hard (no), or even that they make assembly knowledge unnecessary or obsolete (not even close). This tends to lead to a mentality that optimization can be left to the compiler and often results in people warning others to not bother with micro-optimizations at all.
Optimization, today, is largely about writing some naive code, looking at the assembly output, figuring out what you want to improve on, and then getting (or, often, tricking) the compiler into doing what you want. After several iterations of this, you end up with the code you want, and you gain all the benefits of having a compiler in the loop (portability, future-proofing, and so on). This process, though, obviously involves a great deal of low-level knowledge: assembly, cpu architecture, micro-optimizations that are likely to work, etc.
As an aside, I have a question: how exactly does one go about learning how to go beyond standard C into hardcore optimization? For me, it’s been a long road of trial & error, and random discovery. It seems to me that there is probably good resources out there for getting started on optimizing code (let’s say in gcc). Know any?
All the code I used can be found here: http://github.com/lbrandy/simple-optimization-test (see signal_blur.c). I ran all of these tests on Linux using gcc 4.3.3 and -O3. UPDATED: I changed the code to use c99′s restrict keyword, instead of gcc’s __restrict__
Getting yourself into trouble
Here is a naive (and contrived) one dimensional filter routine.
#define LENGTH 5000
// a bad impersonation of a gaussian filter
const float filter[] = {0.01f, 0.2f, 0.58f, 0.2f, 0.01f};
void naive(float* in, float* out)
{
int i,j;
for (i=0;i<LENGTH-4;i++)
{
out[i]=0.0f;
for (j = 0;j<5;j++)
out[i] += filter[j] * in[i+j];
}
}
This is an interesting example because it contains a fatal flaw that might not be obvious. Take a second to understand what this code is doing. It’s fairly simple, but critical to the rest of the discussion. Let’s look at the assembly output of this, built using -O3.
08048430: 8048430: 55 push %ebp 8048431: 31 c0 xor %eax,%eax 8048433: 89 e5 mov %esp,%ebp 8048435: 8b 4d 08 mov 0x8(%ebp),%ecx 8048438: 8b 55 0c mov 0xc(%ebp),%edx 804843b: 90 nop 804843c: 8d 74 26 00 lea 0x0(%esi),%esi 8048440: d9 ee fldz 8048442: d9 14 82 fsts (%edx,%eax,4) 8048445: d9 05 10 89 04 08 flds 0x8048910 804844b: d9 04 81 flds (%ecx,%eax,4) 804844e: d8 c9 fmul %st(1),%st 8048450: de c2 faddp %st,%st(2) 8048452: d9 c9 fxch %st(1) 8048454: d9 14 82 fsts (%edx,%eax,4) 8048457: d9 05 14 89 04 08 flds 0x8048914 804845d: d9 44 81 04 flds 0x4(%ecx,%eax,4) 8048461: d8 c9 fmul %st(1),%st 8048463: de c2 faddp %st,%st(2) 8048465: d9 c9 fxch %st(1) 8048467: d9 14 82 fsts (%edx,%eax,4) 804846a: d9 05 18 89 04 08 flds 0x8048918 8048470: d8 4c 81 08 fmuls 0x8(%ecx,%eax,4) 8048474: de c1 faddp %st,%st(1) 8048476: d9 14 82 fsts (%edx,%eax,4) 8048479: d9 44 81 0c flds 0xc(%ecx,%eax,4) 804847d: de ca fmulp %st,%st(2) 804847f: de c1 faddp %st,%st(1) 8048481: d9 14 82 fsts (%edx,%eax,4) 8048484: d9 c9 fxch %st(1) 8048486: d8 4c 81 10 fmuls 0x10(%ecx,%eax,4) 804848a: de c1 faddp %st,%st(1) 804848c: d9 1c 82 fstps (%edx,%eax,4) 804848f: 83 c0 01 add $0x1,%eax 8048492: 3d 84 13 00 00 cmp $0x1384,%eax 8048497: 75 a7 jne 8048440 8048499: 5d pop %ebp 804849a: c3 ret 804849b: 90 nop 804849c: 8d 74 26 00 lea 0x0(%esi),%esi
From the assembly, we see that the compiler has done quite a bit (if you don’t know assembly, don’t worry, I’ll do my best to explain). First we see all of the address calculation involved in loading the filter coefficients has vanished (it is loading each one directly, e.g. flds 0x8048918). Second, notice that the (fixed sized) inner loop has been completely unrolled, resulting in each of the 5 multiplications and additions per iteration. So far so good.
There is, however, a very alarming surprise is this code. That is the quantity of store instructions (also loads). After every iteration of our inner loop (each filter coefficient), the result is stored. You can see 5 different store instructions (fstps, fsts) per iteration of this loop. Why? Let’s have a look at the code again:
void naive(float* in, float* out)
{
int i,j;
for (i=0;i<LENGTH-4;i++)
{
out[i]=0.0f;
for (j = 0;j<5;j++)
out[i] += filter[j] * in[i+j];
}
}
To the inexperienced, it might be bewildering why the optimizing compiler would generate 5 store instructions for out[i]= in the inner loop. Why wouldn’t it just accumulate the answer in a register, and then store only the final result? The answer is: aliasing. The problem here is that compiler cannot assume that the pointers *in and *out are disjoint. It must store the result into out[i] each iteration because out[i] may be in[i+j] in the next iteration of the inner loop. With a bit of thought, it becomes clear how this code requires these stores to be correct in cases like *out pointing one float ahead of *in.
Another hard-learned tidbit: wasteful store instructions are terrible because stores can be incredibly expensive (far worse than an extra add or multiply).
Fixing it with restricted pointers
There are several ways to fix this problem, but in the spirit of teaching the art of optimization, I’ll go with the use of the __restrict__ qualifier (this is a gcc directive, but most compilers have some support for restricted pointers). [note from comments: restricted pointers are part of the C99 standard now using the keyword restrict. ]. The only change I made is to add __restrict__ to the function declaration:
void naive_restrict(float *__restrict__ in, float *__restrict__ out)
{
int i,j;
for (i=0;i<LENGTH-4;i++)
{
out[i]=0.0f;
for (j = 0;j<5;j++)
out[i] += filter[j] * in[i+j];
}
}
This directive tells the compiler that you, the programmer, promise that *in and *out are disjoint, and no aliasing will occur. If you break your promise, don’t expect your code to be correct. Here is the assembly of that output:
080484a0: 80484a0: 55 push %ebp 80484a1: 31 c0 xor %eax,%eax 80484a3: 89 e5 mov %esp,%ebp 80484a5: 8b 55 08 mov 0x8(%ebp),%edx 80484a8: 8b 4d 0c mov 0xc(%ebp),%ecx 80484ab: 90 nop 80484ac: 8d 74 26 00 lea 0x0(%esi),%esi 80484b0: d9 05 10 89 04 08 flds 0x8048910 80484b6: d9 04 82 flds (%edx,%eax,4) 80484b9: d8 c9 fmul %st(1),%st 80484bb: d8 05 0c 89 04 08 fadds 0x804890c 80484c1: d9 05 14 89 04 08 flds 0x8048914 80484c7: d9 44 82 04 flds 0x4(%edx,%eax,4) 80484cb: d8 c9 fmul %st(1),%st 80484cd: de c2 faddp %st,%st(2) 80484cf: d9 05 18 89 04 08 flds 0x8048918 80484d5: d8 4c 82 08 fmuls 0x8(%edx,%eax,4) 80484d9: de c2 faddp %st,%st(2) 80484db: d8 4c 82 0c fmuls 0xc(%edx,%eax,4) 80484df: de c1 faddp %st,%st(1) 80484e1: d9 c9 fxch %st(1) 80484e3: d8 4c 82 10 fmuls 0x10(%edx,%eax,4) 80484e7: de c1 faddp %st,%st(1) 80484e9: d9 1c 81 fstps (%ecx,%eax,4) 80484ec: 83 c0 01 add $0x1,%eax 80484ef: 3d 84 13 00 00 cmp $0x1384,%eax 80484f4: 75 ba jne 80484b0 80484f6: 5d pop %ebp 80484f7: c3 ret 80484f8: 90 nop 80484f9: 8d b4 26 00 00 00 00 lea 0x0(%esi),%esi
Now, you do not have to be an assembly expert to see how much more streamlined this code is. It consists almost exclusively of loads, multiplies, and adds with a final store at the end. This does exactly what we’d originally hoped. It realizes it can keep a temporary running sum and only store once at the end. We should also note that if you violate the restricted pointer promise, this code will not be correct!
It shouldn’t surprise you to see how much faster this version is, either:
naive: 0.672957 naive_restrict: 0.160432
It’s almost 5 times faster than the non-restricted version.
Going Forward
Though I haven’t actually done it, my overwhelming suspicion is that this code still has a ways to go in terms of speed. The next step would be the use of SSE instructions. Maybe next post.
The take-away from examples like this should be that optimization cannot be “left” to the optimizing compiler. You have to know what you are doing to get a compiler to make fast code. And if you really know what you are doing (which means knowing assembly and the underlying architecture quite well), you can usually get a modern optimizing compiler to do exactly what you want. You have to be in the loop. Truly optimized C code ends up looking nothing like C at all.
With careful hand holding, you can get a compiler to make fast code. In that case, it can become difficult to beat a compiler with hand-optimized code. But that is not because the compiler is so good, but because you are so good at getting the compiler to make the code you want.
Tweet
twitter
June 7th, 2010 at 8:16 am
They actually tried to introduce a keyword into standard C to deal with things like pointer aliasing. But it never made it into the standard. See http://www.lysator.liu.se/c/dmr-on-noalias.html for Dennis Ritchie’s thoughts on the proposed “noalias” keyword.
June 7th, 2010 at 9:53 am
The “restrict” keyword is in C99; #define-ing it away for old compilers makes sense, but there’s no need to use the (obviously unportable) “__restrict__” extension.
On the other hand, “naive” unrolling would work, too: out[i] = filter[0] * in[i] + … + filter[4] * in[i + 4] should generate good code (not tested, sorry!)
Not that it matters to the present discussion, but your “naive implementation” has an off-by-one error, accessing in[LENGTH].
But this is just nitpicking: nice article!
June 7th, 2010 at 9:56 am
restrict is a valid C pointer qualifier and works.
paragraph 6.7.3.1 in the C99 Standard.
It is not needed to use the __restrict__ , just use restrict.
June 7th, 2010 at 10:14 am
This totally, completely, and utterly misses the point of compiler optimization, and optimization in general.
First, yes, per current state-of-the-art, a master-level human assembler can assist the compiler to produce better, safer, faster code in these cases. However, who cares? Seriously, you are talking about, what, 0.6 second. It’s not worth the effort! Oh, did you even realize that you are burning time on 10 lines of code? Seriously, a toy program has thousands of lines of C code already. Good luck trying to optimize every single line. And it’s not like you can get out 0.6 sec every time. Maybe 0.01 next time. Remember, with a 3GHz CPU, 1% of a second is a LONG time.
Oh, and did anyone tell you that mucking with non-standard stuffs are bound to disaster? And micro optimization may not work with future things? For example, multithreaded. Try really hard to minimize the IO of a single thread, but if the OS swaps out your process, destroy your cache, than the cost of swapping is so huge that your little optimization is not even worth mentioning.
Meanwhile, why don’t you invest time in something with higher return. For example, data structures that minimize cache hits and can be used over and over. Or, algorithms that are magnitudes faster than your current ones? Or, a better programming paradigm? Or, JIT with selected compilation? Structural stuffs, not little tiny details.
For some reasons, this reminds me of Vietnam war. America had all details right. They had bigger guns, bigger bombs, bigger airplanes, bigger budget; they won just about anything that they mucked into. And they got kicked out. Micro optimization, you think?
June 7th, 2010 at 10:19 am
It’s not that optimizations like these are too hard, it’s that for the vast majority of applications they’re completely unnecessary. There are very few problem domains where dropping down to assembly is worth the time it takes to optimize, let alone the cost of readability and maintainability. If you’re one of those few engineers working on a kernel or some other piece of performance-critical system code, then more power to you. But suggesting that knowledge of assembly is necessary for the truly average programmer is ridiculous.
That said, props on restrict – I’d never seen that before.
June 7th, 2010 at 10:19 am
@magice, thanks for the predictable response.
Nowhere have I argued that you should be optimizing “all” of your code, or optimizing indiscriminately (or prematurely), or without exhausting other, better options. Nowhere have I said this type of optimization is preferable to cache optimizations, or other high level concerns. Knowing -when- to bust out micro-optimizations is, in and of itself, a skill worth discussing.
However, if you are actually arguing that it is never correct to spend time doing micro-optimization, you are wrong. It’s just that simple.
June 7th, 2010 at 10:27 am
As pointed out to me at http://news.ycombinator.com/item?id=1410755, the off-by-one error is nonsense. Sorry!
June 7th, 2010 at 10:53 am
Very nice article, thanks.
Looking forward to the next post.
June 7th, 2010 at 11:11 am
0.672957 / 0.160432 is about 4.2 — quite a bit short of “almost 5″. I nitpick here because numbers really matter when optimizing. Too many a time I’ve made similar leaps thinking that I had previously obtained “nearly 5 times speedup” and looked down upon something that was 4.5 times faster. Only to find out that the “nearly 5″ was actually slower because I wasn’t being careful.
June 7th, 2010 at 11:56 am
Lots of resources on optimizing here : http://www.agner.org/optimize/
June 7th, 2010 at 11:56 am
Nice article! Out of curiosity, I tried adding -funroll-loops, which made the naive function run faster than the naive_restrict function.
Before adding -funroll-loops:
naive: 0.461494
naive_restrict: 0.143920
sum_variable: 0.105645
After adding -funroll-loops:
naive: 0.107970
naive_restrict: 0.125535
sum_variable: 0.096111
Looking forward to an SSE post.
June 7th, 2010 at 12:21 pm
Funny, I couldn’t recreate your results with gcc 4.3.3 on my machine. The O3 code I got used SSE, fully unrolled both loops, and did the elimination of the unnecessary stores. It also inlined naive() into main().
I imagine the problem is that the assembly code you looked at was for just the naive() function itself. If you call that function from somewhere, then GCC will use alias analysis to detect that the pointers don’t alias, and probably inline, obviating the need for the restrict keyword. Granted, GCC doesn’t have the best alias analysis in the world, but I’d be shocked if this example is one that breaks it.
June 7th, 2010 at 12:38 pm
You can get the same output (if you really don’t have overlapping pointers) with a simple temporal variable
float temp= 0.0f;
for (j = 0;j<5;j++)
temp += filter[j] * in[i+j];
out[i] = temp;
you clearly tell the compiler what to do, with this anti-optimization (I declared more variables than needed)
June 7th, 2010 at 12:40 pm
When you look at the cases where hand optimization can beat the compiler it normally comes down to the human knowing things the compiler can’t know. In this case you can inform the compiler but you can’t always.
June 7th, 2010 at 12:49 pm
@nate,
Did you run the code as I have it on github, or just copy/paste this into your own file? I ask because I just double checked everything, and am not seeing the same things as you. It would seem, to me, that the function pointers as I have it setup keep gcc from inlining it. I’ve been running gcc 4.3.3 on a relatively clean ubuntu 9.04 box.
Is it generating vectorized SSE, or just using the SSE stack for scalar fp operations? I can’t really get gcc to autovectorize this code at all, by playing with flags.
I can, though, get decent speedups by using ‘-O3 -msse3 -mfpmath=sse’ which will force it to generate sse scalar instructions instead of fpu instructions.
June 7th, 2010 at 12:57 pm
@magice
To thee I quote my professor from back in my undergraduate days:
“Hardware exists to provide scalability and high availability, not to cater to poorly optimized code.”
That being said, what @louis said is a perfectly acceptable compromise. Sure application developers will always want to be lazy, so we shall cut them some slack and allow them to optimize just the critical segments of code.
Remember, just as you as an application developer want to be lazy and finish your work quickly so you can go out and play, the people using that shitty-end-result of an application also want to do the same.
P.S. when I say “you” I mean the general masses of application developers who have no sense of pride in their work.
June 7th, 2010 at 1:09 pm
The reason so many program runs slow and hog memory is because ignorant people quit worrying about optimizing code. They just assume it is a non-issue with a 3GHz processor, but they are wrong.
The idiots don’t get it. LENGTH of 5000 might be trivial, but what if you had to do calculations on LENGTH of 100 MILLION or many different number tables, then savings would add up to be very large amounts of time.
June 7th, 2010 at 1:32 pm
Great article! It’s not often that you see someone brave enough to wander into the dark depths of assembly programming. To those of you who respond with frustration and anger: I can only assume that you have a hard time reading/writing assembly.
Saying that micro-optimizations aren’t useful is just silly. Most of you should be aware that some functions are called repeatedly, and those “2 seconds saved” per run may amount to a lot of time.
Also, measuring the need for optimization based on the number of lines of code, indicates a terrible lack of understanding.
June 7th, 2010 at 2:03 pm
@Enlightenment: With LENGTH=10^8, you would be memory bandwidth limited.
June 7th, 2010 at 2:05 pm
Being someone who spends most of my working day optimizing code, I have to agree that the idea that the compiler can optimize better than a human is a limited concept.
In the author’s example, we have a perfect case where the compiler doesn’t have enough information to perform specific optimizations.
In other cases, the code is just too complex for the compiler to properly deconstruct and optimize. This was a big problem with some compiler when dealing with C++ templates.
That being said, I still find that most optimizations are higher level optimizations. For example, using proper containers or data structures. Repeating work is usually a big cause of problems.
Good article.
June 7th, 2010 at 2:05 pm
I was kinda suspicious of your results so I tried your code on cygwin using actual timed tests. I bumped the LENGTH up to 100*1000*1000 to give it something to chew on, and checked the assembler to make sure it replicated your results (which it did). Using time ./a.exe. The profile time fluctuates from 0.577s to 0.655s on the fast version and from 0.545s to 0.670s on the slow version. Frankly, the error seems higher than the effect of the optimization.
I may be wrong, but I suspect the memory timings are all important here. The most expensive operation is loading a chunk of data from main memory into the cache. The additional stores the you managed to optimize out only use the local cache so they are very fast and are effectively hidden by the load from main memory. The fluctuation in timings is caused by the cache getting flushed by other apps in system.
Feel free to criticize my results but I suspect your optimization would only work on older hardware and embedded systems that aren’t so dependent on memory.
I would be interested to see your own results if you tried to time the optimizations. I personally believe that your article is fundamentally flawed because you didn’t profile your results.
June 7th, 2010 at 2:15 pm
Thanks for the good article on basic optimization.
@magice – Listen to the responses on this article – you could learn a thing or two.
As a video game programmer back in the early 90′s, I used to manually unroll inner loops and convert them to assembly by hand. Done correctly, it can change the video performance from 5-7fps to 30+ fps.
The entire program does not need to be optimized, just the slowest, most-run code. That’s what a profiler is for.
-PMF
June 7th, 2010 at 2:16 pm
I apologize – I didn’t see the timings at the end of the article.
June 7th, 2010 at 2:20 pm
@Fenris: That isn’t really an “anti-optimization” per se, as that variable would be put on the stack, which doesn’t really add extra memory (since it doesn’t call any functions within). Unless this call was very deep in the stack, I doubt it would affect anything.
June 7th, 2010 at 2:42 pm
Thanks for the article.
I come from an embedded background where I have found bad C compilers producing painfully bad code.
Sheeple are repeatedly quoting ‘Gotos Bad’, ‘Assembly Optimization Bad’, wondering what the next mantra will be…
June 7th, 2010 at 3:38 pm
A good introductory-level textbook that sets the context for this topic is Computer Systems: A Programmer’s Perspective, Bryant & O’Halloran. Uses X86, Linux (with appropriate Windows asides), and GCC, covers 32-bit and 64-bit architectures. It doesn’t dive into detail on hard-core optimizations, but it does address memory aliasing and other compiler optimization blockers, memory caches, etc. Written for undergraduate students who are new to both C and X86 assembly.
http://csapp.cs.cmu.edu/
June 7th, 2010 at 3:56 pm
I see nothing about you trying to avoid stalling the instruction pipeline in the CPU. The days of non-pipelined, CISC chips are looong gone, my friend.
June 7th, 2010 at 4:25 pm
I understand the point of your article, but it’s just plain silly not to mention that you will on average get more optimization by picking the most efficient algorithm and profiling than you ever will with premature optimization. Get it to work -right- first and then get it to work -fast- if profiling tells you it’s necessary. You should always mention this as beginners will come across your article and think “well now I have to back and optimize everything”.
June 7th, 2010 at 4:33 pm
@nate clark: note that such analysis doesn’t work if it’s not in the same source file; even with GCC’s new whole-program optimization, it won’t work if the function is in a shared library.
June 7th, 2010 at 5:17 pm
@Magice: oh really? 0.6 seconds?
The largest performance boost I’ve achieved solely from microoptimizations, from trying to “beat the compiler” is… over 800%. The application as a whole ran 8 times as fast when we were finished optimizing.
I think that’s worthwhile. It certainly was in that specific case.
Your assumption that it is not worth it to consider performance on a lower level than choice of algorithm is simply foolish.
But bad programmers will always try to make a case for why being a bad programmer, why not understanding the hardware you’re running on, or the code you’re executing, is a good thing.
Guess what. At the end of the day, they’re still bad programmers. They’re missing some vital tools in their toolbox.
A good programmer certainly knows that messing around with low-level optimizations isn’t the first thing you should do, but he *also* has the option available to him. He has the skills required to understand his code, and the performance of that code, because *sometimes* that is essential in getting good perfomance.
June 7th, 2010 at 5:42 pm
http://www.amazon.com/Computer-Architecture-Quantitative-Approach-4th/dp/0123704901 was an invaluable help in my case. It’s not about optimization, but it’s an amazing resource for understanding the CPU. And when you do that, most optimization tricks become pretty obvious.
Also look up the documentation for your target processor (both AMD and Intel have these available in PDF form on their websites)
Knowing which instructions are pipelined and which ones aren’t, and their relative latencies and all that good stuff helps too. Reading the assembly output from the compiler isn’t enough. You also have to know how expensive the instructions it chose are.
Humans certainly can beat a good compiler. And you don’t have to be a “master-level human assembler” to do it. You just have to think. Choose your battles. The compiler is unbeatable on its home turf. don’t try to optimize on the things it does well.
But in many other cases, a human programmer knows things about the code that the compiler doesn’t. And that can make a huge difference. Using “restrict” as shown in this blog post is one example. The compiler can’t safely do this because it doesn’t know if it’ll break the program. The human programmer *does* know that this assumption is safe to make.
June 7th, 2010 at 5:55 pm
this whole article could be written in five lines.
June 7th, 2010 at 6:08 pm
Rather than blame developers for failing to compensate for problematic language features, why not use a language that doesn’t have those problems in the first place (excluding kernel devs, etc)? The simple fact is that pointers eliminate many kinds of automatic optimization. Fran Allen noted this decades ago. I think it’s time to stop apologizing for C.
June 7th, 2010 at 6:26 pm
Use -fpmath=sse to use the SSE unit instead of x87 and -ftree-vectorize (might already be part of -O3) to generate vectorized SSE code.
June 7th, 2010 at 11:30 pm
yeah… go ahead … optimize…
I’m not saying that optimizing is never useful, quite the contrary. However from experience Optimizing has almost always (note.. almost) been harmful. It introduces bugs that are hard to fix because the code is harder to read, it makes the system much more fragile and looses a lot in resilience over external changes. Especially when the optimization is done through compiler tricks and exploiting obscure corner cases.
Optimization such as this should be as isolated as possible, writing the code directly in assembly language would have avoided some of the pitfalls. Burying the code in a library that is properly unit tested and provide a proper interface to it so that the optimization does not spread beyond it`s original intended use would be even better.
However, was it really necessary ? Sure we can come up with a thousand cases where optimization yielded sizeable benefits. But there are far more cases where optimization was done prematurely. No amount of optimization on a bubble sort will beat using a more efficient algorithm. But worst of it is the hidden cost which, with time becomes not so hidden.
Many times I have seen corporations with systems many millions of lines of codes that are so hard to maintain because of the systematic optimization done. In fact I have seen optimization techniques that became laws, your code HAD to do it this way. The end result was that the systematic application of this without regard of the actual impact actually made the code run MUCH slower, I’m not speaking 0.6 seconds or 8x slower I’m speaking orders of magnitude slower.
To make matters worst the code is not so convoluted and over optimized that it is nearly unmanageable. A simple feature that should take a few weeks to implement now takes over a year because of the fragility these optimization brought to the code base. Fix something here and five other things break…
YES optimization is a good thing when used properly, so is goto, firearms, insecticide and land mines. However it is because they are so easy to mis-use that we recommend that they are not be used at all. Most (99%) people are not smart enough to use them properly and in doing it make life impossible for the rest of us.
If you are to optimize then do it proper, roll up your sleeves and get the assembly code directly. but please by the love of good do not do it by tricking the compiler. If it is not smart enough for you then remove it for the bits you think you can do better and have fun but do not try to inject smart behaviour in stupid things it will come back to haunt you.
June 8th, 2010 at 2:05 am
For compilers without restrict support, I believe this would work:
void naive_tempvar(float *in, float *out)
{
int i,j;
for (i=0;i<LENGTH-4;i++)
{
float sum = 0.0f;
for (j = 0;j<5;j++)
sum += filter[j] * in[i+j];
out[i] = sum;
}
}
By fixing the location of out[i] as a temporary variable, the aliasing problem is removed, and the compiler should optimize that code into near-identical assembly.
Please correct me if I'm wrong.
June 8th, 2010 at 2:30 am
@Louis: Good article.
@Drifter: Testing such optimizations stand-alone won’t give you valid numbers. At least with N == 5000. The loading of the exe alone takes too much time.
@No-hand-optimization folks:
Understanding where and when to optimize is IMHO essential for a good programmer. And since the article does not state directly optimizing for a 3GHz-PC you really should consider the vast number of programmers working on embedded systems, which are often restricted in speed _and_ memory.
Sure: Writing assembly for todays IA32/IA64 cpus is overkill. One just cannot write code which is optimal for a Pentium 4 and a i7.
But knowing how to help the compiler to do the best job for you is the trick.
And BTW: Ever wondered why Thunderbird takes longer to start on a 3GHz machine than Netscape on a 200MHz computer ?
I suggest, every application developer must be forced to program on 500MHz computers, it’ll help the users
June 8th, 2010 at 2:31 am
Nice article, it will be interesting to see your SSE-version.
But, there’s always one, to truly optimize your code you should leave high-level code and rewrite the algorithm in assembly (but this you already knew). Trying to squeeze some extra oompf out of the code generated gives, most of the times, just a few percent — occasionally more. The drawback are that you tends to get bugged down in the generated code instead of seeing the whole algorithm and translating that.
That said, it’s not a total waste of time scurrying through the output from the compiler — when hunting for speed bugs, as your example clearly shows.
June 8th, 2010 at 2:36 am
@Graycode : Tested with PowerPC-gcc -O3: The stores are gone, but the restricted version is better unrolled using a lot multiply-add instructions. But compiling with size optimization your code gives the better result (size sometimes matters).
June 8th, 2010 at 2:50 am
Louis,
an interesting and relevant article for those involved in optimizations.
A comment to Magice: my bread and butter is real-time image processing. I can be requested to handle, say, 4 Mpixels images 60 times per second. So my time scale is the millisecond. In such a context, optimization really matters
Unfortunately for us, the behavior of modern processors becomes less and less predictable and repeatable. This slowly turns the art of programming to black magic…
June 8th, 2010 at 3:06 am
Hi,
Interesting article but missing a warning, because on real cases (I mean professional work), the use of “restrict” will certainly lead to a nice big crash
when a new developper arrives and changes the parameters, without knowing that *in and *out should not be overlapping (or do you really expect him to check all the code before changing 3 lines, or knowing what the “restrict” keyword means ?)
In fact I’ll even bet that the code writer himself will eventually break the code some months later because he would have forget about this optimization…
In my opinion the temporary variable solution mentionned by Graycode is far better because it does not introduce unnecessary risks and is finally much more clear on what the processor is really doing.
June 8th, 2010 at 3:16 am
Read a lot of comments from people who go at great depths to say: just leave it up to the compiler. Bad idea. I don’t care if you don’t check the optimization into source control; stick around at work for another 1/2 hour and optimize a few of your methods.
The reasoning is simple: as a programmer you should not trust a black box. You need to understand how the tooling you are using works; where the nuts and bolts are, how everything fits together all the way down to the metal.
Doing anything less is giving the rest of us a bad name. These people are the type who don’t know when to use an enumerator and when to use an array. These are the type of people who think that the year 2000 will never happen.
June 8th, 2010 at 4:18 am
Choosing algorithms and data structures wisely to avoid combinatorial explosion is the very best optimization of human time and safety. Micro-optimization is non-sense.
June 8th, 2010 at 4:25 am
A nice article!
I suspect the source of contention between those advocating and those not preferring micro-optimisation is due to the type of applications being developed. I do a lot of high performance computing and can assure you when you have to run heavy computation with long loops over multi-million element vectors and your application might be running hours on end, micro-optimisation can definitely be worth the effort. This of course means you would first thoroughly profile the application and have a good understanding of the bottlenecks before setting out to optimise.
June 8th, 2010 at 4:26 am
The example given in this article *isn’t* micro-optimisation. Micro-optimisation is doing things like changing
q = x[a + b] * y[a + b];intotemp = a + b; q = x[temp] + y[temp];, or changing * 2 into << 1.Adding appropriate "restrict" qualifiers isn't any kind of overengineering, either; as you've noted, it's something that allows the compiler to optimise better, without affecting the readability or maintainability of the code. In my book, correct use of the "restrict" qualifier is just as valuable and praiseworthy as correct use of the "const" qualifier.
June 8th, 2010 at 5:32 am
I suggest that you just have a poor compiler. Visual Studio 2010 has naive running on a LENGTH array in 14.1754 nanoseconds. Also, I really didn’t like how you didn’t give example inputs or other use cases, or just your general setup.
June 8th, 2010 at 8:52 am
Good article, haven’t played with C/asm optimisation for a while. I have to do a lot of optimisation in my job, in C#, and most of my stuff is limited to algorithms. The problem is in maintenance. If I can’t get the monkeys to understand what I’ve done – it never gets used again. The problem is we all have to deal with marginal coders, and we can’t just blow them off – because they ARE the majority.
But it was good to read about some real optimisation again!
June 8th, 2010 at 9:49 am
Some pretty unfair comments in what predictably turned into a flame war (be it a fairly civil one so far). The fast majority of slow response times in all applications today are from one of 3 factors 1) Blocking on I/O inefficiently (especially networked) 2) Allocating memory inefficiently (especially string construction) 3) Context/Thread switching (inefficiently?)
So my vote would be to teach all of the app developers in the world how to optimize those 3 things before we get into micro optimization religious wars. Also, when app developers don’t micro optimize their code, it’s usually not because they don’t take pride in their work, it’s more often because they have to balance features and quality against performance gains.
June 8th, 2010 at 9:50 am
And by fast majority I meant vast majority. Limited editing abilities….
June 8th, 2010 at 10:43 am
OK, the point of adding the restrict keyword is to get it to accumulate results in a temporary variable in the loop.
Why don’t you just do that?
void naive(float* in, float* out)
{
int i,j;
for (i=0;i<LENGTH-4;i++)
{
float sum = 0.0f;
for (j = 0;j<5;j++)
sum += filter[j] * in[i+j];
out[i] = sum;
}
}
This not only lets the compiler know it doesn't need to worry about aliasing, it's more readable once the expressions (as they tend to) become more complex.
June 8th, 2010 at 5:55 pm
@jon and @jalf
- Pipelining may not be so much an issue on lower end / embedded processors, where you would be looking exactly to make the kind of optimisations demonstrated in this article.
Even the Atom, if I understand it correctly, will not particularly negatively affect your expected execution order/speed from it’s in-line pipeline.
June 8th, 2010 at 10:03 pm
I like the explanation…
btw i believe this was compiled with gcc and not a MS product
June 9th, 2010 at 11:29 am
I’d like to see the same code with fixed-precision integers instead of floats — what’s the speed difference?
I know FPUs are great these days, so maybe there’s no difference. Last time I was doing similar code (90′s) it was a huge no-no to do floating point because of the performance problems. I’m curious!
June 11th, 2010 at 11:18 am
@AlainT – Kudos. I was reading this entire comment thread waiting for someone to mention exactly what you said. In a theoretical world of “Hey let’s see if I can make my code go faster / make the smallest PE possible / win the obfuscated C contest”, this is pretty neat. In the real world of someone else maintaining your code or, as you said, returning to your own code in several years with a different compiler/build environment and rusty knowledge, techniques that require specific programmer knowledge and specific compiler behavior have the potential to create some really painful and hard to find bugs.
June 19th, 2010 at 5:29 pm
fdsdfsgfd
June 28th, 2010 at 1:13 am
very helpful,useful!Sometimes ,it’s miserable that your codes are just running slowly, but u dont know why
July 23rd, 2010 at 6:59 pm
did you ever consider looking at gcc instead of looking at your code (whether C or assembly)?
more specifically do you understand how flex and bison work? do you know what they are doing (in characteristic american “micro-detail”? that’s for you magice
gcc is a mess. it’s not clean code. and this makes it extremely difficult to fix things.
and writing compilers is not easy.
someone suggested reading a book on “good code”.
perhaps you should be reading a book on compilers?
programmers are not only lazy but, today, they are also ignorant. and therein lies the problem. history has been overlooked.
knowing assembly is not enough to understand the problem. you need to understand the steps taken to get from C to assembly, and how they are implemented.
and that’s a lot of work.
note also: there are alternatives to gcc. but you need to investigate the “ancient” past to find them.
August 19th, 2010 at 3:02 pm
Great post. Good job.
November 15th, 2010 at 11:34 am
Heya. This may have been pointed out already, but the fundamental point of your post seems to be that GCC isn’t doing the best job of optimizing, but GCC isn’t really an optimizing compiler at all, even though it has some optimization options. Supposedly Intel’s compiler is the gold standard for computation-heavy optimization (at least for their procs); this is why the Gentoo streetracers like to use ICC for wherever it’s possible. I have never heard of anyone using GCC if their goal is to churn out the fastest possible code.
An interesting article, though.
May 29th, 2011 at 10:17 pm
The problem when discussing asm code improvement is always the same. Some people just don’t get the subtle difference between “never” and “almost never”.
The truth is: we should *almost never* code in assembly for speed. It *almost never* makes any sense. I *almost never* go to the doctor, because I’m almost always in good health. Fighter pilots *almost never* need to press the eject button. That doesn’t mean they never should!
October 11th, 2011 at 11:51 pm
Enhanced my skills..very well written!
November 6th, 2011 at 4:22 am
Replica Cartier Ballon Bleu Automatic Men’s Watch W69012Z4
Replica Cartier Ballon Bleu Automatic Men’s Watch W69012Z4
larger image
Units in Stock25
Bookmark and Share $357.00 $235.00
Save: 34% off
International shipping cost $27.9
Buy 2 units items or more, USA outside international shipping is free.
Shipping to All Countries in the world.
Cartier Ballon Bleu replicabirthday party invitations
November 6th, 2011 at 6:58 am
Geras Organic Olive Oil 750ml
GERAS olive oil owes its unique quality to the particular conditions prevailing in Lesvos groves and comes from Kolovi and Adramitiani olives, the varieties flourishing in the bay of Geras. Rich in antioxidants, vitamins, beneficial polyphenols.
€12.00 As low as: €10.00
olive oil sprayPregnancy Forum
November 7th, 2011 at 1:27 am
affordable real estate agent websiteProfessional Recruitment For Candidates
Our Realty solutions
are “easy” and “affordable” for Real Estate Agents
We make it easy for any agent to get onboard and focus on selling with our RealtyAgent Open Listings Management – FREE Real Estate Website.
Call 1 (866) 967-0982, Talk to a Representative and Get Started
{ Visit demo.kylarealtysolutions.com }
November 8th, 2011 at 1:02 am
About Top Gun Replica Watches
We have thousands of cheap replica watches for sale, they are best replica watches, we only keep high quality replica watches on our website, so we don’t have many hot models which on other website but low quality, our luxury top quality high end replica watches just exact like what your see on our pictures, you can buy any mens and ladies watches replicas from us, you will get pefect wrist watches, all them will make you happy.
homehome
December 31st, 2011 at 2:58 am
remote controlled toysFamosos
Descubra com qual celebridade você se parece
Você sempre se achou parecido com alguma celebridade, mas nenhum dos seus amigos ou familiares têm a mesma percepção? Então mostre para eles com quem você se parece com o…
December 31st, 2011 at 11:58 am
alarm systems for apartmentschatten
10 Home Security Systems Benefits To Keep In Mind
Home security systems are the top most requirements for safety purposes. There is a boom in this industry that has allowed home security companies to flourish and it gives many options to the customers to choose from.
January 1st, 2012 at 9:43 pm
yachtcrewκατασκευή ιστοσελίδας
Maritimworld.com is a job portal that provides something different. The reason is quite simple.
Developed and maintained by ‘First Solutions’ a Norwegian Company.
January 1st, 2012 at 9:47 pm
Welcome to Banner Stand Pros . . .
The Superstore for Banner Stands!
You’ve arrived at the best spot on the internet for banner stands. Here you’ll find the largest selection of retractable banner stands (also called roll up or pull up stands), portable models, which are the non-retractable styles, as well as models for trade shows, retail or outdoor use, accessories, replacement parts, shipping cases and more. Our huge selection means you’re sure to find exactly the right display or accessory for your needs, and our large volume means we can offer the best prices in the industry.
banner display standsAustralia Shoes
January 1st, 2012 at 10:29 pm
Please join us in enjoying the luxurious lifestyle that Naples, Florida has to offer—–beautiful tropical weather with warm breezes off the Gulf of Mexico. Fine dining, world class shopping, championship golf and boating are a part of the wonderful lifestyle that Naples has to offer.
Quail WestRYA Course
January 2nd, 2012 at 9:54 pm
bedsresort Fraser Island
Bed and Mattresses for a comfortable night sleep from Absolute Beds London
The best deal bedroom furniture store where you can find bespoke beds and mattresses exclusive beds frames at best and cheap prices in London and UK.
January 2nd, 2012 at 9:59 pm
celebrity newslocation de salle lyon
Is Sonic Youth breaking up? If so, we will need to take a break from gossiping for a few hours to weep and smash furniture.
January 3rd, 2012 at 2:02 am
Air Conditioning Service NYC
Five Borough AC Is The Air Conditioning Experts For Residential And Commercial Units. We Are One Of The Leading HVAC Service Providers In NYC.Our Respond To Your Air Conditioning Repair Needs Will Be Fast And Professional. Our Goal Is To Make Sure Your Air Conditioner Will Preform At It Best. Our Air Conditioner Repair Mechanics Are EPA Certified And Provide Top Customer Service.
air conditioners nycNAIL
January 3rd, 2012 at 8:13 am
Featured Product Freaky Cousin Outdoor Lime
$145.00
The Freaky Cousin bean bag’s vibrant colour and unique shape perfectly compliments any lounge room, childs play area or bedroom. DIMENSIONS 105 x 77 x 55 Recommended beans 220 ltrs Polyester 420 D with PVC Coating Be careful with the shocking color and shape effect of the Freaky Cousin.
outdoor bean bagswedding shower invitation
January 11th, 2012 at 11:17 am
doormats with sayingsst lucia holidays
Serve the Lord MatMates™
This Serve the Lord MatMates™ pattern welcome mat
January 13th, 2012 at 7:44 am
Saving money on ink cartridges is an easy way to lower expenses
We’re a Proud Member of the BBB ASAP Inkjets offers discount inkjet cartridges and laser toner for Epson, Canon, HP, Lexmark, Brother and many other printers at enormous savings.
inkjet print cartridgesMusic for Documentaries
February 1st, 2012 at 9:55 am
Epson M2400 Tonerpower tab files
Compatible Epson M2400 Toner
Our compatible Epson M2400 toner fits the VOSA supplied MOT printers and gives high-quality print for a lot less cost than the Epson originals.
February 1st, 2012 at 10:49 am
holidays to antiguadata cabling engineer
Antigua Weather
As one would expect, Antigua boasts a very generous climate all year round, with only a slight dip in temperature during the winter months.
February 3rd, 2012 at 9:13 am
For many years, we have been dedicated to finding the most talented, up-to-date and qualified dentists to meet your dental needs. Every dentist and office hosted on 18004SMILES.com has met these qualifications.
dentist in san diegoизготовление пластмассовых изделий
February 4th, 2012 at 11:38 am
Monkey Taxi (1 played)
Dress My Delicious (1 played)
vibratorben10 games
February 8th, 2012 at 9:06 am
como hacer un guionclaw foot bathtub faucets
This stylish free standing gooseneck faucet combo set will make a wonderful addition to your vintage bathroom. This classic set comes with everything you need to deliver water to your
$642.00
February 8th, 2012 at 9:26 am
Antigua All Inclusive Holidays and Cheap Deals
Looking for that perfect get away from it all holiday? Our holidays to Antigua include getaway holidays and cheap deals to the Indian Ocean Island of Antigua.
cursos guione-cigaret med nikotin
February 9th, 2012 at 4:40 am
Because again comply with the slim budget, air was chosen relatively inexpensive resort, which unfortunately was away from the others, but still closer to the sea.
климатици
February 10th, 2012 at 11:03 am
IT Support in London
IT support for London businesses
We aim to understand your business
Local knowledge of services
One point of call for your IT needs
Specialists in small to medium size companies
business it support londonnon profit fundraising
February 11th, 2012 at 12:04 pm
group buying in south africaResidential Extended Care Drug Rehab Los Angeles
Group-Buying
Welcome to our personal social group buying platform of services, trades, products and packages.
February 11th, 2012 at 12:44 pm
Rumahrouwverwerking
Century 21 Broker Properti Rumah Apartement Tanah Ruko Dijual Disewa Beli Cari di Indonesia
February 15th, 2012 at 3:01 am
dubai holidaypijn
Dubai Holidays & Hotels
Dubai Holidays and Tourist Information
February 15th, 2012 at 3:18 am
Suffolk County Truck Accident Attorney
Truck Accident Lawyers serving Suffolk County NY. If you are looking for Truck Accident Lawyers in Suffolk County, Foley Griffin, LLP can help.
Tractor trailer, and other commercial truck accidents can be devastating for victims and their families, but the truck accident attorneys at Foley Griffin, LLP have been helping victims of truck accidents in Suffolk County receive the compensation they deserve for years.
bottled alkaline waterTruck Accident Attorney Suffolk county ny
February 19th, 2012 at 2:08 am
Thank you for your interest in the Flying Duchess
Please fill out the form below to set up your complementary in home consultation.
Toronto Dog WalkingLiferay
February 19th, 2012 at 2:17 am
This course targets persons with no previous knowledge in Spanish who would nevertheless like to talk and interact with the local community. In this Spanish course you will study two weeks in Quito, one week in Baños (Ecuador) and go back to quito to work two weeks in a social aid programm. While working with children you will get the opportunity to practice the language
Volunteering in Ecuador and learning Spanishchristmas scavenger hunt
February 19th, 2012 at 10:29 pm
quarzhandschuhe günstig kaufenCredit Card Review
Füllung im Finger und Knöchelbereich mit Metallstaubfüllung. 4 Seperat abgenähte Kammern für gleichmäßige Verteilung und deutlich bessere Beweglichkeit der Finger. Schlagkraft bei diesem Handschuh ist extrem erhöht.
February 20th, 2012 at 9:55 pm
holiday in barbadosGalapagos Travel
Barbados All Inclusive Holidays and Cheap Deals
Looking for that perfect get away from it all holiday? Our holidays to Barbados include getaway holidays and cheap deals to the Caribbean Island of Barbados.
February 21st, 2012 at 9:38 am
It did not matter that you are a bar to the middle of nowhere, with a total freak because he could walk on two legs and was splattered on the asphalt somewhere. влагоуловители
February 22nd, 2012 at 8:08 am
Choose from 15 easyHotels and remember by booking earlier, you will usually find the cheapest prices!
hotels in centre of edinburghFree Car Insurance Rates
February 22nd, 2012 at 10:04 am
And continue to work with the vacuum, but absolutely nothing was received and the sink sat just as clogged as well come. Finally Drain decided to apply less force to increase pressure and ponatisna greater the vacuum, almost leaning against the sink. отпушване на канали
February 22nd, 2012 at 9:46 pm
Cognitive behavioral therapy (CBT) is a rather complicated term for a very simple thing – as sadly, terms in psychology often are.
In a nutshell CBT therapy is a bit like having a wise old friend who, whenever you go to see them, can always manage to get you to look at things from a different angle. Or think of something you hadn’t previously thought of.
cbt therapyleather and canvas handbag
February 24th, 2012 at 3:06 am
yacht charter balearic islandsbalers
The choice is wide, ranging from the most modern and luxurious yachts to traditional gulets, from catamarans to classic sail boats.
February 24th, 2012 at 3:12 am
cute juniors clothesWoodlands Breast Augmentation
En skraplott är en lottsedel som är täckt med ett ogenomskinlig skikt som skall skrapas bort för att avslöja om man vunnit. Vanligtvis avgörs vinst eller förlust av om man har en viss kombination av symboler eller ej. Man vinner olika mycket beroende på vilken kombination man har – ju mer osannolik en kombination är desto större är vinstsumman
February 24th, 2012 at 10:33 am
Enjoy Excellence Service And Exclusive Pricing With Atlanta’s Top Restaurants And Catering Companies!
Free Consultations Provided By Atlanta’s Top Restaurant Owners.
Bundled Deals On Photographers, Live Music, And More!
Solar Panels DundeeCatering Companies in Atlanta
February 24th, 2012 at 11:11 am
يورو دولارhotels in Sikkim
Superview Himalchuli , Gangtok
Gangtok and all places to visit is superb the superview himachuli is looking beautiful from infront
February 26th, 2012 at 9:14 pm
active directory recoverySmall Business Web Designer
Blackbird recovery for Active Directory
The only continuous recovery option for active directory, with unmatched change control and easily roll back to any previous state
February 26th, 2012 at 10:26 pm
holiday to barbadosPens, Montegrappa pens
The weather in Barbados is split into two distinct seasons. The wet season lasts from June to November, with rainfall of between 40 and 90 inches during that period.
February 27th, 2012 at 1:58 am
homehome
7010162
Gel Polish Gul
DKK 139,00
Gel Polish Gul, NYHED!!
Vejledning :
Rens dine negle, buf let. Påfør som en alm lak, hærd i 2 min i uv lampe.
Kan opløses i Acryl remover.
February 27th, 2012 at 6:19 am
The first component of the price / quality ratio is clear: maximum sustainable windows of any changes in climate over the years, this is a quality joinery. интериорни врати
February 27th, 2012 at 10:07 am
designer homes brisbaneaddress stamps
Custom Design Homes
AAD Design are award winning industry leaders in custom home design. AAD Design have won over 40 regional, state and national design awards.
February 28th, 2012 at 5:08 am
homehome
One of the most painful aspects of coping with infidelity within a relationship can be the loss of your identity. For many of us, a large part of our identity is connected to our relationship.
February 28th, 2012 at 5:11 am
homehome
Welcome to GoToTheBeach Real Estate & Vacation Rentals
Specializing in 30A Real Estate & Seagrove Beach Real Estate
February 29th, 2012 at 11:37 pm
One of the most painful aspects of coping with infidelity within a relationship can be the loss of your identity. For many of us, a large part of our identity is connected to our relationship.
March 1st, 2012 at 1:39 am
Home Business WebsiteThe weather in Barbados is split into two distinct seasons. The wet season lasts from June to November, with rainfall of between 40 and 90 inches during that period.
March 1st, 2012 at 9:36 pm
With a choice of
Low net/bulk airfares
Non-capped commissions
for First, Business & Economy class published air faires to thousands of destinations worldwide.
business class ticketAccident & Sickness Insurance
March 2nd, 2012 at 8:00 am
homehome
Coby LEDTV1926 19″ LED-LCD TV
COBY 19IN LED HD TV 720P ATSC/QAM/NTSC TUNER HDMI
ATSC – NTSC – 1366 x 768 – 1 x HDMI
In Stock
Price: $189.99
Retail: $179.99
You Save: ($10.00)
March 2nd, 2012 at 10:18 pm
franchise opportunitiesdyson hand dryers
Best Small Business Franchise Opportunities
Discover the Best Franchise and Small Business Opportunities With Management 2000
March 2nd, 2012 at 11:33 pm
iPhone App Developmentseo
What can RCM Blitz™ do for You?
Reduce unit cost of product by lowering maintenance costs and improving Overall Equipment Effectiveness (OEE)
March 3rd, 2012 at 12:39 am
ETERNAL LIFEAssay Development
A close study of the Bible will reveal mystery after mystery. WHY? Because the Bible was chiefly written to inform mankind about eternal life and how to get such (2 Tim. 3:15).
March 4th, 2012 at 8:54 pm
Mijn winkelwagen
U hebt niets in uw winkelwagen.
Producten vergelijken
Geen producten geselecteerd.
raamfolie kopenbreitling watches replica
March 5th, 2012 at 8:16 am
Court is an independent body which assists the EP and the Council of Ministers in exercising their powers to oversee the implementation of the Community budget.
Димитър Веселинов Калинов
March 5th, 2012 at 8:19 am
For this purpose, it examines reports on revenue and expenditure of all bodies. Its representatives regularly participate in meetings of parliamentary committees on budget and supervise its execution. Димитър Веселинов Калинов
March 8th, 2012 at 3:26 am
The ideas of federalism in Western Europe have a centuries-old historical tradition. In the period after World War the idea of creating a United States of Europe has organized a powerful pan-European movement. електротехник
March 9th, 2012 at 3:06 am
For example, can benefit from the EC budget provided for the development of vocational education for youth projects, projects to promote predprioemachestvoto and others. електро услуги
March 12th, 2012 at 3:47 am
Have a considerably larger amount during the year during the monsoon often cause major flooding. дърводелски услуги
March 13th, 2012 at 8:55 am
Do not allow the participation of private entrepreneurs in the following sectors: atomic energy, railways and products opomenati the special list of Department of Atomic Energy.
смяна на щрангове
March 14th, 2012 at 7:37 am
Government building strategy that provides healing to 85% of patients and detection of the disease at an early stage. вик ремонти
March 16th, 2012 at 8:43 am
non profit fundraiserRound Multicolor Toe Ballet Flat
Shop for amazing savings. Support causes that make a difference.
BiddingForGood is the intersection of charity and shopping – an auction site that allows you to shop while supporting causes that make a difference. Get great deals and experience feel-good shopping at its finest! Sign up to get the weekly scoop on secret steals and deals in your inbox.
March 20th, 2012 at 9:41 am
Industrial structure of the state gradually diversify by opening proceedings such as chemical, oil refining, fertilizers, electronics manufacturing. The largest industrial centers are Dzhhadagiya, Vagra and Saul. бойлери
March 28th, 2012 at 6:00 am
This would have disastrous consequences given that, according to the UN today 200 million people in this country are fed poor that directly threaten their lives. мобилни климатици
April 10th, 2012 at 7:25 pm
I am very enjoyed for this site. Its year informative topic. It helps me very much to solve Some Problems. It’s so fantastic and Opportunity are working style so speedy. I Think It May Be help all of you. Thanks a lot for this beauty Enjoying article with me. I am appreciating it very much! Looking Forward to Another Great article. Good luck to the Author! All the best.
April 17th, 2012 at 9:03 am
Awning with Drop shoulder is suitable for installation of storefront κλιματιστικά located on lower floor.
May 16th, 2012 at 7:23 am
To care for the garden, lawns, fountains and garden lights дограма автоклиматици