but that will cause problems on computers with slow drives or no virtual
memory in the OS or compiler. A 'reasonable' solution on a non-cached
computer might thrash the cache of a more modern computer, causing it to
run at 1/10th the speed it should. Every optimization you make, of any
sort, has a price that somebody else will end up paying. Nothing is
'free', even optimizations.
There are quite a few _algorithmic_ things you can deal with, but low
level things, such as your *256 vs. <<8 example, are areas that should
be left to the compiler. If converting *256 to <<8 (or back) is
appropriate, the compiler will do it. Strength reduction is a classic
optimization that's been around since Fortran was new! Generating
efficient code is the compiler's job. Compilers are generally fairly
decent at it, and 'interference' from you can cause sub-optimal
performance for somebody else. The code they generate is of course only
'generic' code and may not be optimal for all processors (386/486/586/686
etc.), but it will probably run okay. Carefully tuning the source for
your platform, though, is likely to cause worse results for others.
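
To make the *256 vs. <<8 point concrete, here's a minimal sketch in C
(the function names are made up for illustration, and it assumes an
unsigned operand so the multiply and the shift really are equivalent):

  /* For an unsigned operand these two functions mean the same thing.
   * A decent optimizing compiler will emit the same code (typically a
   * shift) for both, so writing the shift by hand buys you nothing and
   * only obscures what the source is actually saying. */
  #include <stdio.h>

  unsigned scale_mul(unsigned x)   { return x * 256; } /* what you mean */
  unsigned scale_shift(unsigned x) { return x << 8; }  /* what it becomes */

  int main(void)
  {
      printf("%u %u\n", scale_mul(3), scale_shift(3)); /* both print 768 */
      return 0;
  }

Compile both with the optimizer on and compare the generated code; on
any reasonably modern compiler you will very likely see the multiply
turned into a shift for you.
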
Playing with the numerous possible optimization switches is fine.
(Within limits, or you may tune it so closely to your particular computer
that it runs very poorly on some other configuration.) That can be done
on any compiler. But low level optimizations can cause problems and
performance penalties for everybody else. What is optimal code on your
computer can result in very poor executable code and run time on
somebody else's. If your program is never going to be ported to another
system, or even run on somebody else's system, but only on your own
single computer, then fine, feel free to optimize your heart out.
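
As an example of the kind of switch tuning that's fair game (gcc-style
switches used here as an assumption; use whatever your own compiler
actually provides):

  gcc -O2 prog.c -o prog          (generic optimized build)
  gcc -O2 -m486 prog.c -o prog    (tuned for a 486; may be slower on a 386)

Anyone compiling the source themselves can pick the second line or not,
depending on their own machine, which is exactly why switch tuning is
fair when you ship source.
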
To put it all simply, there are several types of optimizations you can
make, and what you can get away with depends on the situation.
1) If the program is only your program and will never be run by anyone
   else, then all is fair. Feel free to make whatever changes you want.
2) If you are distributing source to a limited group (such as a PC with
   a 25MHz 386DX, a brand X video card, and a brand Y drive), then all
   command line switch optimizations are fair, because everybody will be
   compiling it themselves and will tune it themselves. Low level source
   code changes may or may not be appropriate. It will depend on how
   close their system is to yours.
3) If you are distributing source to everybody, then again, compiler
   command line switch optimizations are fair because everybody can do
   that themselves. Low level source changes (such as *256 vs. <<8) are
   a 'no-no'. By making changes such as those, you are tuning the source
   based on the executable for _your_ system. There are too many cases
   where something that is optimal for one system is a poor choice for
   another, and you can bet your kid's college fund that it'll cause
   performance problems for somebody. (A decent optimizer will _already_
   make changes such as *256 vs. <<8 when appropriate, though.)
4) If you are distributing an executable to a limited platform (such as
   requiring a specific processor and amount of memory), then you control
   the compiler that will be used, so you can use whatever generates the
   best code for that platform. You can make some low level changes
   because you can tune the executable for a specific situation. This is
   very close to #1 above, except you are requiring everybody else to
   have a system similar to yours.
5) You can do the same as #4 except aim towards a nice fat middle-of-the-
   field target, such as requiring a 486 with Windows (or something.) In
   that case, you can use switches such as -m486 (or whatever fits the
   processor you are targeting), and you can try some code optimizations,
   such as being aware of caches (see the sketch after this list). The
   resulting code may run on both a 386 and a 686, but since their
   designs are so different, there is no way to predict in advance how
   fast it will run on each.
6) You can simply distribute a nice generic executable. Aim straight
   down the middle of the playing field. And accept that, for example, a
   386DX may take 2 minutes, a 386SX at the same clock speed may take 3.5
   minutes, an Intel 486/66 takes 1 minute, a Cyrix 486/66 will only take
   40 seconds, and an AMD 486/66 takes 47 seconds. And that FPU
   performance may be more than acceptable on one system but totally
   unacceptable on another.
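
Here's the cache sketch promised in point 5, in C. The array size is
arbitrary and only illustrative; the point is that on a processor with a
cache the first loop tends to be noticeably faster, while on a cacheless
machine the two cost about the same, so hand-tuning for one can be
wasted (or worse) on the other.

  /* C stores rows of a 2-D array contiguously, so walking it row by
   * row touches memory sequentially, while walking it column by
   * column strides through memory and can thrash a small cache. */
  #define N 512
  static long a[N][N];

  long sum_row_major(void)        /* cache-friendly on cached CPUs */
  {
      long s = 0;
      int i, j;
      for (i = 0; i < N; i++)
          for (j = 0; j < N; j++)
              s += a[i][j];
      return s;
  }

  long sum_column_major(void)     /* same result, poor locality */
  {
      long s = 0;
      int i, j;
      for (j = 0; j < N; j++)
          for (i = 0; i < N; i++)
              s += a[i][j];
      return s;
  }
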
Optimization depends very heavily on your goal and on the particular
platform the code will be run on. Some optimizations are appropriate
for some situations. Others aren't, and can cause performance problems.
I can understand the idea of what you are saying, but it just isn't that
way anymore. Long gone are the days when it was best to simply turn on
all the compiler optimization switches, or when you could depend on them
to at least not _hurt_ performance. The same goes for low level code
changes. Between 'intelligent' optimizers and the incredibly wide range
of different families of processors, platforms, and clone chips,
optimizing simply isn't simple anymore.
--- QScan/PCB v1.19b / 01-0162
* Origin: Jackalope Junction 501-785-5381 Ft Smith AR (1:3822/1)