JC>Nobody needs to apologize to you at all. You are the one who needs to
JC>go back and read what's been written. The mere fact that you get one
JC>set of results with your compiler doesn't prove a thing about what
JC>anybody else is going to get with another compiler. For the sake of
JC>comparison, I ran your code through MS C. The first loop, which you
JC>assumed would be the slowest, was in fact consistently the fastest.
JC>Your "optimizations" consistently slowed the code down. In fact, with
JC>MS C, the first loop compiled to:
JC> mov eax, DWORD PTR ?bufSize@@3HA ; bufSize
JC> push edi
JC> test eax, eax
JC>; Line 15
JC> jle SHORT $L158
JC> mov esi, OFFSET FLAT:?buf2@@3PADA ; buf2
JC> mov edi, OFFSET FLAT:?buf1@@3PADA ; buf1
JC> mov ecx, eax
JC> shr ecx, 2
JC> rep movsd
JC> mov ecx, eax
JC> and ecx, 3
JC> rep movsb
JC>Note that the majority of the move is done as efficiently as a 486 can
JC>possibly do: with a `rep movsd'.
JC>By contrast, your "optimized" code produced a mess; the resulting code
JC>is over 5 times as long, and roughly 20% slower.
JC>Now, if you write only for Watcom, your "optimization" might be useful.
JC>If you want to produce good code with nearly every compiler on earth,
JC>and optimal code with most, consider using:
JC> memcpy(buf1, buf2, bufSize);
JC>It's pretty rare that this will produce poorer code than an explicit
JC>loop; with many compilers it will do considerably better. Come to that,
JC>most decent optimizers know how to unroll loops on their own, and most
JC>produce better code for the unrolled loop than you can explicitly.
JC>Generally if you think you need to unroll a loop by hand, you really
JC>just need to learn to use your compiler.
JC>This raises the question: has Watcom's compiler _really_ gotten this much
JC>worse since I used it last? At one time, it had a perfectly good
JC>optimizer, but if your results are truly indicative of the best the
JC>compiler can do, it's gotten a LOT worse in the last several years.
JC> Later,
JC> Jerry.
Lies!
You claimed to have read the thread, but you obviously failed to do so.
Thank you for showing off your 'vast' knowledge, but if you had actually
bothered to READ THE THREAD, or even read the topic, you would have noticed
a little word that spells 'DJGPP'.
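
For anyone who wants to check the claims above on their own compiler (DJGPP or otherwise), here is a minimal sketch of the two copy styles being argued about. The names buf1, buf2 and bufSize come from the quoted listing; BUF_MAX, the value 1003, the function names and the exact shape of the loop are assumptions made for illustration, since the original test code is not quoted in this message.

```c
#include <assert.h>
#include <string.h>

#define BUF_MAX 4096        /* assumed capacity, not taken from the thread */

char buf1[BUF_MAX];         /* destination, as in the quoted listing */
char buf2[BUF_MAX];         /* source */
int  bufSize = 1003;        /* assumed byte count; deliberately not a multiple of 4 */

/* Explicit byte-at-a-time loop: an assumed stand-in for the "first loop". */
void copy_loop(void)
{
    int i;
    for (i = 0; i < bufSize; i++)
        buf1[i] = buf2[i];
}

/* Library version. The count is the number of bytes held in bufSize;
   sizeof(bufSize) would only be the size of an int. */
void copy_memcpy(void)
{
    assert(bufSize >= 0 && bufSize <= BUF_MAX);
    memcpy(buf1, buf2, (size_t)bufSize);
}

/* Fill the source with a pattern, clear the destination, run one copy
   routine, and return 1 if exactly bufSize bytes arrived intact. */
int check_copy(void (*copy)(void))
{
    int i;
    for (i = 0; i < bufSize; i++)
        buf2[i] = (char)(i * 7 + 1);
    memset(buf1, 0, sizeof buf1);
    copy();
    return memcmp(buf1, buf2, (size_t)bufSize) == 0 && buf1[bufSize] == 0;
}
```

Both routines should pass check_copy() on any conforming compiler. Which one is faster is exactly the point in dispute, so compile with your own compiler's assembly-listing option and compare the output rather than trusting anyone's numbers from a different compiler.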
... Beware of programmers carrying screwdrivers!
--- Ezycom V1.48g0 01fd016b
---------------
* Origin: Fox's Lair BBS Bris Aus +61-7-38033908 V34+ Node 2 (3:640/238)