echo: c_plusplus
to: DANIAL GIBSON
from: DARIN MCBRIDE
date: 1997-04-20 13:56:00
subject: Re: Video

DM>x+(y<<6)+(y<<8): 182
DM>x+y*320: 169
DM>Consistently your shift is slower by about 6%.  This is
DM>with standard optimizations and debugging.  No fancy stuff
DM>here.  Putting optimization to -O3 obviously destroys the
DM>entire thing.  :-)
 DG> Hmmm, weird. Destroys? How?
As in, when you put on optimizations, the compiler "recognizes" that the code 
in the loop is "useless" and eliminates it.  :-)
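For anyone wanting to repeat the timing with optimizations on, here is a rough sketch of one way to keep the loop from being thrown away: write through a volatile pointer so the stores count as observable.  Everything named here (TimeFill, OffsetShift, OffsetMul, kWidth, kHeight) is made up for illustration and is not the code that produced the numbers quoted above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Assumed 320x200 screen; the buffer is an ordinary array, not real video memory.
constexpr int kWidth = 320;
constexpr int kHeight = 200;

// The two offset calculations under discussion (hypothetical helper names).
inline std::size_t OffsetShift(int x, int y)
{
    return static_cast<std::size_t>(x + (y << 6) + (y << 8));
}

inline std::size_t OffsetMul(int x, int y)
{
    return static_cast<std::size_t>(x + 320 * y);
}

// Fill the whole buffer `passes` times and return the elapsed milliseconds.
// Writing through a volatile pointer means the stores cannot be removed,
// though the optimizer is still free to simplify the address arithmetic.
template <typename OffsetFn>
double TimeFill(OffsetFn offset, int passes, unsigned char* buffer)
{
    assert(buffer != nullptr);
    volatile unsigned char* out = buffer;

    const auto start = std::chrono::steady_clock::now();
    for (int p = 0; p < passes; ++p)
        for (int y = 0; y < kHeight; ++y)
            for (int x = 0; x < kWidth; ++x)
                out[offset(x, y)] = static_cast<unsigned char>(p);
    const auto stop = std::chrono::steady_clock::now();

    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling TimeFill(OffsetShift, 1000, buf) and then TimeFill(OffsetMul, 1000, buf) on the same kWidth*kHeight byte buffer gives two figures to compare.  What they come out to will depend on the compiler, the flags and the CPU, so treat any single run with suspicion.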
DM> DG> On mine, a 686 at 120 MHz, under Windows 95, the 
DM> DG> shifting method was faster
DM> DG> in every instance.
DM>Is this a Cyrix 686?
 DG> Let me check.
 DG> ...Two reboots later...
 DG> Darn non-system disk...
:-)
 DG> Yep.
Ok, I believe that Cyrix has slightly different cache optimizations than 
"normal" Intel which would result in your "skewed" results.
 DG> I guess really, so long as you don't need lots of frames per
 DG> second (ie. not more than 10 or so), it doesn't really matter what
 DG> method you use. They are both about the same. And given the speed of
 DG> today's processors, and the compilers, it doesn't matter in terms of how
 DG> smooth the video is (I mean, shifting or multiplying won't affect
 DG> it). You should just not plot too many pixels. If you plot too many
 DG> pixels then you should change the algorithm.
Therefore, what we come down to is algorithm and readability, and not actual 
speed.  So if there's no difference between (y<<6)+(y<<8) and y*320, why use 
the less readable version?  Perhaps if you were going for a placement in the 
OCCC, we could understand it.  :-)
inline void Plot(int x, int y, char colour)
{
  // since 320*y is the same as (y<<6)+(y<<8), we can do:
  video_buffer[x + (y<<6) + (y<<8)] = colour;
}
vs
inline void Plot(int x, int y, char colour)
{
  video_buffer[x + 320*y] = colour;
}
The speed is the same.  The former, however, is likely more code (i.e., 
bigger executable) and DEFINITELY is less readable.  With two points in its 
favour, we should opt for the second one.
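If you'd rather check the equivalence than take the comment's word for it, something like the following would do.  y<<6 is 64*y and y<<8 is 256*y, and 64 + 256 = 320, so the two forms should agree for every non-negative y.  The helper names (OffsetShift, OffsetMul, OffsetsAgree) are invented for this sketch.

```cpp
// Both forms of the pixel offset, as constexpr so they can be checked at
// compile time as well as at run time (hypothetical helper names).
constexpr int OffsetShift(int x, int y) { return x + (y << 6) + (y << 8); }
constexpr int OffsetMul(int x, int y)   { return x + 320 * y; }

// The shift factors 2^6 and 2^8 must add up to the row width of 320.
static_assert((1 << 6) + (1 << 8) == 320, "shift factors must sum to 320");

// Spot check at compile time: the last pixel of a 320x200 screen.
static_assert(OffsetShift(319, 199) == OffsetMul(319, 199),
              "shift and multiply forms disagree");

// Returns true if both forms agree for every pixel of a width x height screen.
bool OffsetsAgree(int width, int height)
{
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            if (OffsetShift(x, y) != OffsetMul(x, y))
                return false;
    return true;
}
```

OffsetsAgree(320, 200) covers the whole screen.  Since the results are identical, the choice between the two really is down to readability and whatever the benchmark says.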
inline void Plot(int x, int y, char colour)
{
  //video_buffer[x + 320*y] = colour;
  // the following does the same, but our benchmark shows it to be three
  // times faster:
  video_buffer[x + (y<<6) + (y<<8)] = colour;
}
If the comments were true, this would be an acceptable version.  (However, 
our benchmarks have shown this to be false, so obviously we wouldn't do it 
here.)
--- Maximus/2 3.01
---------------
* Origin: Tanktalus' Tower BBS (PVT) (1:342/708)

SOURCE: echomail via exec-pc
