echo: linuxhelp
to: Geo
from: Tony Ingenoso
date: 2005-05-03 01:40:26
subject: Re: This is not an Anti-OSS Flame

From: "Tony Ingenoso" 

~30% faster on the benchmarks I ran against a 386SX.  The 16-bit-bus CPUs
are VERY memory constrained.  Almost anything you can do to minimize the
amount of code they're fetching is a win.

(WFZ used to call 'em the "castrati of computing") - the execution
unit almost always outruns the prefetcher and memory system.

Depending on the situation, this might even be faster on the DX & 486
CPUs.  The prefetcher can get something to execute with one mem cycle
rather than two.  If you're in a long unbroken sequence, the MOV might be
faster if the prefetcher can get to it, but if you just jumped to a branch
target and the prefetch queue was blown away, a plump instruction at the
target that takes multiple mem cycles to fetch isn't going to be a win.
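
To put rough numbers on the fetch argument, here is a minimal Python sketch
that counts how many aligned bus reads it takes to pull in an instruction
sequence.  It assumes the 32-bit encodings (XOR EAX,EAX = 2 bytes, INC EAX =
1 byte, MOV EAX,1 = 5 bytes), a branch target aligned to the bus width, and
an empty prefetch queue.  fetch_cycles is a made-up helper for illustration;
it ignores wait states and everything else a real bus does.

```python
def fetch_cycles(start, length, bus_bytes):
    """Count aligned bus-width reads needed to fetch `length` code bytes at `start`."""
    first_word = start // bus_bytes
    last_word = (start + length - 1) // bus_bytes
    return last_word - first_word + 1

XOR_INC_BYTES = 2 + 1   # XOR EAX,EAX (2 bytes) + INC EAX (1 byte)
MOV_IMM_BYTES = 5       # MOV EAX,1: opcode + 32-bit immediate (assumed encoding)

# Bus widths in bytes: 2 for the 16-bit parts, 4 for the 32-bit parts.
for name, bus_bytes in (("16-bit bus", 2), ("32-bit bus", 4)):
    small = fetch_cycles(0, XOR_INC_BYTES, bus_bytes)
    plump = fetch_cycles(0, MOV_IMM_BYTES, bus_bytes)
    print(f"{name}: XOR/INC {small} reads, MOV {plump} reads")
```

Under those assumptions the XOR/INC pair needs two reads against three on
the 16-bit bus, and one against two on the 32-bit bus.  A misaligned target
shifts the counts, which you can check by changing the start argument.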

On machines with L1 caches, packing more usable instructions into your
limited cache lines is goodness.  Stuff that "books" faster but blows
your L1 locality because of puffiness is a loser in reality.  L1 misses can
be 5X+ slower, so you can be nominally "book slower" by quite a
bit and still win handily if it keeps your L1 in better shape.
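
A back-of-the-envelope version of that trade-off, as a Python sketch.  All
the numbers are made up for illustration (the cycle counts, miss rates and
miss penalty are assumptions, not measurements), and effective_cycles is a
hypothetical helper that just adds the average miss cost to the book timing.

```python
def effective_cycles(book_cycles, miss_rate, miss_penalty):
    """Average cycles per pass: book timing plus the expected L1 miss cost."""
    return book_cycles + miss_rate * miss_penalty

MISS_PENALTY = 10.0  # assumed extra cycles per L1 miss (illustrative only)

# Compact code: slower by the book, but assumed to miss L1 less often.
compact = effective_cycles(book_cycles=2.0, miss_rate=0.05, miss_penalty=MISS_PENALTY)
# Puffy code: faster by the book, but assumed to miss L1 more often.
puffy = effective_cycles(book_cycles=1.0, miss_rate=0.20, miss_penalty=MISS_PENALTY)

print(f"compact: {compact:.2f} cycles, puffy: {puffy:.2f} cycles")
```

With these made-up figures, the code that is twice as slow by the book still
comes out ahead once the misses are counted.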

Strunk & White : "let every word tell"

"Geo"  wrote in message
news:4276d6b4$1{at}w3.nls.net...
> "Tony Ingenoso"  wrote in message
> news:427685d4$1{at}w3.nls.net...
> >
> > XOR EAX,EAX  (2 bytes)
> > INC EAX  (1 byte)
>
> So realistically, if you have some 2K loop that executes this repeatedly,
> how much faster can it be?
>
> Geo.

--- BBBS/NT v4.01 Flag-5
* Origin: Barktopia BBS Site http://HarborWebs.com:8081 (1:379/45)

SOURCE: echomail via fidonet.ozzmosis.com
