While we've touched on the topic of library code, here's yet another reason
that C and C++ are particularly difficult to de-compile: macros.
For instance, if I have something like:
while (EOF != (ch = getchar())) {
    if (isupper(ch))
        putchar(ch);
}
getchar, EOF, putchar and isupper are all typically macros, something like:
#define EOF (-1)
#define isupper(x) (__types[(unsigned char)(x)+1] & __UPPER)
#define getchar() (getc(stdin))
#define putchar(c) (putc((c),stdout))
#define getc(s) ((s)->__pos<(s)->__len? \
                 (s)->__buf[(s)->__pos++]: \
                 filbuf(s))
#define putc(c,s) ((s)->__pos<(s)->__len? \
                   (s)->__buf[(s)->__pos++]=(c): \
                   putbuf((s),(c)))
Finally, stdin and stdout are generally just pointers to items in an array
of FILE structures, something like:
FILE __iobuf[20];
FILE *stdin = __iobuf; // This part is done silently by the
FILE *stdout = __iobuf + 1; // compiler, without actual source code
FILE *stderr = __iobuf + 2;
Even if you just expand the macros and never actually compile the code at
all, you end up with something that's basically unreadable. However, this is
what actually gets fed to the compiler, so it's also the absolute best you
could ever hope for from a perfect de-compiler.
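
To make that concrete, here's a rough sketch of the same loop written twice:
once with macros, and once the way it looks after the preprocessor is done
with it. Everything here is made up for illustration (struct toy_stream and
the TOY_* macros are stand-ins, not any real library's stdio), and the
upper-case test assumes ASCII to keep things short.

```c
#include <stddef.h>

/* Hypothetical toy stream, standing in for the library's FILE. */
struct toy_stream {
    char  *buf;   /* backing buffer */
    size_t pos;   /* next index to read or write */
    size_t len;   /* number of usable bytes in buf */
};

/* Hypothetical macros in the spirit of EOF, isupper, getc and putc. */
#define TOY_EOF (-1)
#define TOY_ISUPPER(c) ((c) >= 'A' && (c) <= 'Z')   /* ASCII only */
#define TOY_GETC(s) ((s)->pos < (s)->len ? \
                     (unsigned char)(s)->buf[(s)->pos++] : TOY_EOF)
#define TOY_PUTC(c, s) ((s)->pos < (s)->len ? \
                        (unsigned char)((s)->buf[(s)->pos++] = (char)(c)) : TOY_EOF)

/* What the programmer wrote. Returns the number of bytes written. */
size_t copy_upper_macros(struct toy_stream *in, struct toy_stream *out)
{
    int ch;
    while (TOY_EOF != (ch = TOY_GETC(in))) {
        if (TOY_ISUPPER(ch))
            (void)TOY_PUTC(ch, out);
    }
    return out->pos;
}

/* The same loop with every macro expanded by hand: this is the most a
   perfect decompiler could give back. */
size_t copy_upper_expanded(struct toy_stream *in, struct toy_stream *out)
{
    int ch;
    while ((-1) != (ch = ((in)->pos < (in)->len ?
                          (unsigned char)(in)->buf[(in)->pos++] : (-1)))) {
        if (((ch) >= 'A' && (ch) <= 'Z'))
            (void)((out)->pos < (out)->len ?
                   (unsigned char)((out)->buf[(out)->pos++] = (char)(ch)) : (-1));
    }
    return out->pos;
}
```

Both functions do exactly the same thing, but only the first one tells you
what the author meant. And this is with two short macros; real library
macros are usually a good deal hairier.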
C++ of course adds in-line functions, and after an optimizer runs across
things, the code from the in-line function may well be mixed in with
surrounding code, making it nearly impossible to extract the function from
the code that calls it. There are only a few formats in use for vtables,
which would help in recovering virtual functions, but in-line functions would
be lost, so you'd typically end up with hundreds of places where code appears
to be directly accessing variables in other classes.
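
Here's a small sketch of the in-lining problem. It's written in C99 (static
inline) rather than C++ so it matches the other code here, and struct account
and all the function names are invented for the example. The second caller is
only my guess at roughly what a decompiler could hand back once the accessor
has been in-lined, not the output of any real tool.

```c
/* Hypothetical type; the field is meant to be touched only through the
   accessor below. */
struct account {
    long balance_cents;
};

/* Accessor as written: small enough that a compiler will usually
   in-line it. */
static inline long account_balance(const struct account *a)
{
    return a->balance_cents;
}

/* Caller as written in the source: goes through the accessor. */
long total_as_written(const struct account *accts, int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += account_balance(&accts[i]);
    return sum;
}

/* Roughly what could be reconstructed after in-lining: the call is gone
   and the caller appears to read the field directly. */
long total_as_recovered(const struct account *accts, int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += accts[i].balance_cents;
    return sum;
}
```

The two callers compute the same total, but nothing in the second one says
that an accessor ever existed, so the encapsulation in the original design
is gone.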
Like I said, don't hold your breath. As technology improves to where
decompilers may become more feasible, optimizers and languages (C++, for
example, would be a significantly tougher language to decompile than C) also
conspire to make them less likely.
For years Unix applications have been distributed in shrouded source form
(machine but not human readable -- all comments and whitespace removed,
variable names all in the form OOIIOIOI, etc.), which has been quite an
adequate means of protecting the author's rights. It's very unlikely that
decompiler output would even be as readable as shrouded source.
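
If you've never seen shrouded source, here's a tiny hand-made sketch of the
idea. count_upper is a hypothetical function, and the second version was
shrouded by hand for illustration, not produced by any particular tool.

```c
/* Readable original (hypothetical example): count ASCII upper-case
   letters in a string. */
int count_upper(const char *text)
{
    int count = 0;
    for (; *text != '\0'; text++) {
        if (*text >= 'A' && *text <= 'Z')
            count++;
    }
    return count;
}

/* The same function, shrouded by hand: comments and whitespace removed,
   every name replaced with an O/I pattern. */
int OOIIOIOI(const char*OIOIOOII){int IOOIIOIO=0;for(;*OIOIOOII!='\0';OIOIOOII++){if(*OIOIOOII>='A'&&*OIOIOOII<='Z')IOOIIOIO++;}return IOOIIOIO;}
```

The compiler is perfectly happy with either one. Note that the shrouded
version still has the original control structure and types, which is more
than a decompiler working from optimized object code is likely to recover.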
A general purpose decompiler is the Holy Grail of tyro programmers.
[by Bob Stout & Jerry Coffin]
--- QM v1.00
---------------
* Origin: MicroFirm : Down to the C in chips (1:106/2000.6)