echo: locuser
to: Niels Petersen
from: Frank Malcolm
date: 1997-01-05 08:15:16
subject: Do it yourself Virus chec

Hi, Niels.

NP>  > NP>  > I've just coded the 32-bit CRC which PKZIP & ARJ use in (very) fast
NP>  > NP>  > asm. It takes 6 minutes to calc the CRC of 5400 files, 220meg, on
NP>  > NP>  > one of my hard disks. Also did the 16-bit CRC which all the other
NP>  > NP>  > archivers use, and the 16-bit XModem one but didn't bother timing
NP>  > NP>  > those.
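
For readers who want to try the same checksum without the asm routine, here is a minimal Python sketch. It is not the code discussed in this thread; it assumes Python's standard `zlib.crc32`, which computes the CRC-32 used by the ZIP format, and the function name `file_crc32` and the chunk size are made up for illustration.

```python
import zlib

def file_crc32(path, chunk_size=64 * 1024):
    """Return the CRC-32 of a file as an unsigned 32-bit integer."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Feed the running CRC back in so the file is never held whole in memory.
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF
```

Reading in chunks keeps memory use flat however large the file is, which matters when the job is a whole disk of files.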

NP>  > NP> Is the 32 bit code available in an EXE file ??????

NP>  > Not today, but tell me what you want and it'll be easy to make it so -
NP>  > like, what do you want me to do with the generated CRC?

NP>  > Write it on the screen?

NP> If the program is as fast as you say, then what's the use :-)
NP> It would only slow the program down.

Yes, but I didn't know what you wanted - which you explain below.

NP> > Append it to the .EXE file unless there's already one there in
NP> > which case report changes? I can do that, in fact I'd like to.

NP> Not really an option as there are quite a few programs that will object
NP> violently to this sort of tampering. Nortons being one example.

NP> I have a program that does a CRC and stores it in the time field of the
NP> files directory entry, but this causes problems as well.

Yes, I don't like fiddling with the time field, too much relies on
knowing when files were created, like restores from backup.

NP>  > FYI I'm trying to build something which I've wanted for a long time, to
NP>  > report *any* changes to files (fuck the archive flag, useless), to
NP>  > maintain a complete catalogue of all my files (spread across several
NP>  > computers physically separated (by miles)), mirror certain directories
NP>  > between those physically separate machines,

NP> Two of the networked machines I mirror certain directories on, and have done
NP> that using BAT files. The BAT files on each machine are identical (they even
NP> mirror themselves) thus allowing them to be initiated from either machine and
NP> mirror in both directions in the one operation.

This is very similar to what I need, except the machines are not
networked. One is at work, the other at home and I need to keep a lot of
stuff like my client etc databases in synch between the two. At the
moment I just use a BAT file and PKZIP to put all the files in the
relevant directories onto floppies (3) each morning and each night. I
don't want to do just the changed files (by recognising and resetting
the archive bit) in case for some reason I don't do the restore at the
other end that day - I'd lose knowledge of the changes.

NP>  > include the contents of
NP>  > archives such as ZIP files in all of the above,

NP>  Looking inside the ZIPs is not a consideration for me. Just so long as I
NP> know that the contents of the zip have altered.

It's not for me either for the straight mirroring exercise, but I want
to build this thing into a total file catalogue of all my files anywhere
- the work machine, my space on the work network drive, home machine(s),
backup/archive tapes, etc.

NP>  > etc, etc. It's a most
NP>  > interesting exercise

NP> Certainly is!
NP> I am only partway down the path, but it became obvious I _had_ to do
NP> something when the drive letters in use reached 24 :-)

Twenty-four! How much space have you got?! I'm up to drive N: at the
moment at home - A: & B: floppies, C: primary partition on the 2.1G
drive, D: 212M drive, E: - M: secondary on the 2.1G, N: CD-ROM. But that
will increase when I put the network in here.

NP>  > and means I have to develop *fast* disk I/O,
NP>  > *fast* CRC routines, *fast* directory sorting, *fast* archive
NP>  > extraction, etc - and then put it all together. :-)

NP> As far as the fast routines go, I will upload to TML with this message a
NP> winprogram called DIGSIG.ZIP
NP> It contains the complete description and formulas for 3 different methods of
NP> generating a digital fingerprint of a file that you should find extremely
NP> useful.

NP>  > And I'd quite like the opportunity to put some of what I've done to the
NP>  > test by other users - but I need to know what you want.

NP> I'll test it if you write it, no problems !!!

NP> What I want.......

NP> 1.
NP> Have the program search the whole drive (including all subdirectories to all
NP> depths) but only generate CRC's for the files that have an extension that
NP> matches one in a list of extensions held in (SCAN.LST) a text editable
NP> file.
NP>  Like...
NP> EXE
NP> COM
NP> BIN
NP> OVL
NP> etc
NP> (the program would default to *.* if SCAN.LST does not exist.)

Yep, my routines allow specifiable extensions or "all files".
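
A rough Python sketch of item 1, assuming SCAN.LST holds one extension per line as in the list above. The function names `load_extensions` and `iter_files` are hypothetical and have nothing to do with the routines mentioned in this thread.

```python
import os

def load_extensions(list_path):
    """Return a set of upper-case extensions, or None meaning 'all files'."""
    if not os.path.exists(list_path):
        return None  # no SCAN.LST: fall back to *.*
    with open(list_path) as f:
        exts = {line.strip().upper().lstrip(".") for line in f if line.strip()}
    return exts or None

def iter_files(root, extensions=None):
    """Yield every file under root (all depths) whose extension is wanted."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lstrip(".").upper()
            if extensions is None or ext in extensions:
                yield os.path.join(dirpath, name)
```

Comparing in upper case means `prog.exe` and `PROG.EXE` are treated alike, as they would be on a DOS drive.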

NP> 2.
NP> Have the program take command line parameters of...
NP> Drive to scan      (1 to 10 bytes)
NP> Drive and directory where the applicable LST, DAT & LOG files exist.
NP> Name of DAT file to use.
NP> Name of LOG file to use.

Will be a simple main program wrapper calling my routines.
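
The four parameters in item 2 could be picked up by a wrapper along these lines. This is only a sketch using Python's standard `argparse`; the argument names (`drive`, `workdir`, `dat_name`, `log_name`) are invented here and are not taken from either program.

```python
import argparse

def parse_args(argv=None):
    """Parse the four positional parameters listed in item 2."""
    parser = argparse.ArgumentParser(description="CRC scan wrapper (sketch)")
    parser.add_argument("drive", help="drive or path to scan, e.g. C:")
    parser.add_argument("workdir", help="where the LST, DAT and LOG files live")
    parser.add_argument("dat_name", help="name of the DAT file to use")
    parser.add_argument("log_name", help="name of the LOG file to use")
    return parser.parse_args(argv)
```

A calling BAT file would then pass the four values in that order.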

NP> 3
NP> Check for existence of the DAT file specified in 2
NP>                     in the location specified in 2
NP> If the DAT file does not exist, then create the file.
NP> If the DAT file exists then the data it contains is to be used as the
NP> benchmark for the current scan

Ditto.

NP> 4.
NP> Store in the DAT file the following information.
NP> Full Drive, Path and filename
NP> File size in bytes
NP> CRC or digital fingerprint
NP> (repeated for each file scanned)

Ditto, my routines return/calculate that info.
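
Items 3 and 4 together might look like the sketch below. The on-disk layout (one tab-separated line per file: path, size, CRC in hex) is purely an assumption for illustration; the thread does not say what format either program uses, and `save_dat` / `load_dat` are hypothetical names.

```python
import os

def save_dat(dat_path, records):
    """Write {path: (size, crc)} as one tab-separated line per file."""
    with open(dat_path, "w", encoding="utf-8") as f:
        for path in sorted(records):
            size, crc = records[path]
            f.write(f"{path}\t{size}\t{crc:08X}\n")

def load_dat(dat_path):
    """Read the benchmark back; a missing DAT file yields an empty benchmark."""
    records = {}
    if not os.path.exists(dat_path):
        return records
    with open(dat_path, encoding="utf-8") as f:
        for line in f:
            path, size, crc = line.rstrip("\n").split("\t")
            records[path] = (int(size), int(crc, 16))
    return records
```

A plain-text DAT file keeps it viewable and trimmable from a BAT file, at the cost of being larger than a packed binary format.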

NP> 5
NP> Check for the existence of the LOG file specified in 2 in the location
NP> specified in 2
NP> If the LOG file does not exist then create it.
NP> If the LOG file does exist then _append_ to the LOG file.

Wrapper.

NP> 6
NP> Store in the LOG file the following information...

NP> Date and time of scan                                *****
NP> Volume label of drive scanned                        *****
NP> Number of files processed
NP> Any differences in size or CRC since the last scan
NP> Path and filename, size and CRC of any files added since the last scan
NP> Separator line ------------------------------         *****

Wrapper.

NP> I can handle from the BAT the items marked with ***** if you don't wish to
NP> include them
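
The comparison and log entry described in item 6 can be sketched as follows, assuming each scan is held as a dictionary mapping path to (size, CRC). The function names and the exact wording of the log lines are made up for illustration.

```python
import datetime

def compare_scans(old, new):
    """Return (changed, added) between two {path: (size, crc)} scans."""
    changed = [(p, old[p], new[p]) for p in sorted(new)
               if p in old and old[p] != new[p]]
    added = [(p, new[p]) for p in sorted(new) if p not in old]
    return changed, added

def format_log_entry(volume_label, old, new, when=None):
    """Build one log entry: header, differences, additions, separator line."""
    when = when or datetime.datetime.now()
    changed, added = compare_scans(old, new)
    lines = [
        f"Scan: {when:%Y-%m-%d %H:%M:%S}",
        f"Volume: {volume_label}",
        f"Files processed: {len(new)}",
    ]
    for path, (osize, ocrc), (nsize, ncrc) in changed:
        lines.append(f"CHANGED {path} size {osize}->{nsize} "
                     f"crc {ocrc:08X}->{ncrc:08X}")
    for path, (size, crc) in added:
        lines.append(f"ADDED   {path} size {size} crc {crc:08X}")
    lines.append("-" * 30)
    return "\n".join(lines) + "\n"

def append_log(log_path, entry):
    """Mode 'a' creates the LOG file if missing and appends otherwise (item 5)."""
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(entry)
```

Files that have disappeared since the last scan are not reported here, since the list above does not ask for them.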

NP> 7.
NP> When there is a CRC or filesize difference the program should halt, display
NP> on screen the full path and filename, the old and new sizes / the old and
NP> new CRC's and request a reply as to whether the NEW size or CRC should be
NP> written to the DAT file or to allow the old data to remain.
NP> A simple "Update the DAT file with the new data ? Y/N " should suffice.
NP> (It should NOT halt simply because of a new file existing on the drive)

Halt, or write/append to a log/error file? You might want to run this
unattended.
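
Both behaviours can live side by side behind a switch. A small Python sketch, assuming a hypothetical `unattended` flag: when it is set the old DAT entry is kept and the change is left for the log, otherwise the Y/N prompt from item 7 is shown.

```python
def resolve_change(path, old, new, unattended=False, ask=input):
    """Return the (size, crc) record to keep in the DAT file for a changed file."""
    if unattended:
        return old  # keep the benchmark; the difference is only logged
    print(f"{path}: size {old[0]} -> {new[0]}, "
          f"CRC {old[1]:08X} -> {new[1]:08X}")
    reply = ask("Update the DAT file with the new data ? Y/N ")
    return new if reply.strip().upper().startswith("Y") else old
```

Passing the prompt function in as `ask` is just a convenience so the logic can be exercised without a keyboard.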

NP> 8.
NP> Indication on screen as to which drive is being scanned and some sort of
NP> indication of what position the process is in towards completion.

Hmmm.

NP> The above is what I am already doing with my EXE.
NP> I have the DAT & LOG files stored as hidden/system in the root directory of
NP> the drive scanned, and I am using different named DAT & LOG files on any
NP> given drive depending on whether the scan is being performed on a local
NP> drive or a networked drive. This is essential as the Pathname to any file is
NP> different when it is accessed across the LAN  (Even the LAN path name could
NP> be different depending on which machine the scan is initiated from)

I'll have to think about/experiment with LAN implications.

NP> For speed gain, and less HD head movement (very significant), I place the DAT
NP> & LOG files in a working directory on a RAMDRIVE and copy them back to the
NP> correct location on completion.

NP> It is the _same_ BAT file on all machines and it is aware of which machine
NP> it is being called from. It will alter the Ram drive letter and also the
NP> parameters it feeds the EXE accordingly

All my disks/partitions have a volume label and a zero-length file in
the root directory to identify them - this one is SYSTEM.04C, for
example.

NP> Other things that I will handle from within the calling BAT file...

NP> The DAT & LOG files are System/Hidden but will not be by the time your
NP> program has to access them

NP> The viewing of the log file at the completion of a scan

NP> Trimming of the LOG file if it is above a given limit.

NP> What else do you need apart from time to write it ????? :-)

Er, nothing except that. :-)

At the time you wrote this my routines would have provided very close to
what you want, with a suitable simple calling program. Since then I've
temporarily destroyed some of the logic while I work on some of the more
esoteric things which you aren't interested in. The ability to specify a
starting "directory" which may not be a real directory but say an
archive file within a directory within another archive file etc etc for
example has been quite a complex exercise. Which I've nearly completed.
Then I'll put the other logic back and have a look at your requirements
again.
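
To illustrate the idea of a starting "directory" that is really an archive inside another archive, here is a minimal Python sketch using the standard `zipfile` module. It handles ZIP files only, reads each inner archive into memory, and is in no way the routine described above; the name `catalogue_nested` is hypothetical.

```python
import io
import zipfile

def catalogue_nested(outer, inner_parts=()):
    """List (name, size, crc) for the archive reached by descending through
    nested ZIP members. `outer` is a path or file object for the outermost
    ZIP; each entry of `inner_parts` names a ZIP stored inside the previous."""
    archive = zipfile.ZipFile(outer)
    for member in inner_parts:
        # Pull the inner archive into memory and open it as a ZIP in turn.
        data = archive.read(member)
        archive = zipfile.ZipFile(io.BytesIO(data))
    return [(info.filename, info.file_size, info.CRC)
            for info in archive.infolist()]
```

The sizes and CRCs come straight from the ZIP directory, so cataloguing the contents does not require extracting or re-reading each member.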

Regards, fIM.

 * * With consequences,  the unexpected always predominate.
@EOT:

---
* Origin: Pedants Inc. (3:711/934.24)
SEEN-BY: 711/934 712/610
@PATH: 711/934

SOURCE: echomail via fidonet.ozzmosis.com
