echo: public_domain
to: Bob Lawrence
from: Rod Speed
date: 1995-01-30 13:01:26
subject: sot/eot 1/3

RS> You really should be doing the calculation on the unarchived
RS> size, not the archived size, coz the packing ratios vary quite
RS> a bit between PKT and QWK since the common strings are different.

BL> Yair... but you tend to get a consistent average over various
BL> packets and sizes.

RS> And consistent averages arent any use when trying to detect
RS> if a message or two got lost. With a particular ratio on a
RS> particular PKT/QWK pair you never know if it was an exceptional
RS> collection of messages or some dropped.

BL> A single dropped message will show up in the converter itself,
BL> and ought to be checked there.

Nope, we are talking about an overall check on anything that might have
gone wrong, completely separate from the work being done. The best way
to do that is assume nothing, count the number of messages that go in,
compare that with the number that come out, as long as thats possible.
Nothing else gives the same level of overall check. No assumptions, you
dont give a stuff about the code being checked, whatever it does, you
can see if you are getting the expected result or not.
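
A minimal sketch of the output side of such a check, in Python. It
assumes the QWK MESSAGES.DAT layout: 128-byte records, one packet header
record first, and each message header holding its block count (header
included) as ASCII digits in bytes 117-122. The function names are
hypothetical and nothing here is taken from either converter.

```python
RECORD = 128  # QWK record size in bytes


def count_qwk_messages(data):
    """Count messages in MESSAGES.DAT by hopping from header to header."""
    if len(data) % RECORD:
        raise ValueError("not a whole number of 128-byte records")
    pos = RECORD  # the first record is the packet header, not a message
    count = 0
    while pos < len(data):
        header = data[pos:pos + RECORD]
        # Bytes 117-122 (1-based): number of 128-byte blocks, header included.
        blocks = int(header[116:122].decode("ascii").strip())
        if blocks < 1:
            raise ValueError("bad block count at offset %d" % pos)
        pos += blocks * RECORD
        count += 1
    if pos != len(data):
        raise ValueError("last message runs past the end of the file")
    return count


def check_counts(messages_in, messages_out):
    """True when every message that went in also came out."""
    return messages_in == messages_out
```

The counter knows nothing about how the messages were produced, which
is the point: it only compares what went in with what came out.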

BL> I was more concerned about a dropped packet,

Thats just ONE possible problem. A message counter checks for
ALL possible problems. There is a theoretical possibility that
you might have a situation where you lose one message and another
gets split into two, but thats about the only thing it doesnt catch.
The effort required to cover that one just isnt warranted, tho its
possible to cover that one too.

BL> if it failed to unzip for some reason, or failed to be read
BL> if it had a faulty header, or such. Once the converter opens
BL> the packet it just keeps on truckin'. A broken message will
BL> throw it out of kilter so that it misses a null terminator,
BL> and it will fail then, and generate an error.

And thats about the LAST way you should attempt overall quality
checks, thinking about what can break and checking for that
specifically. Coz it assumes you can identify in advance what
might break. Often you cant and when an overall quality check
shows up a problem, you think, 'shit how bloody obvious'

There is no substitute for no assumptions whatever.

RS> Yeah, thats mostly the PATH and SEENBYS getting dropped,
RS> but the business of unpacked to packed size is surprisingly
RS> complex and you dont even get the same effect over all the
RS> archive formats. Any dictionary style compression has to
RS> give odd and hard to predict behaviour at times.

BL> Yair. I replace the PATH and SEENBYS with spaces, and
BL> I had lots of spaces in the QWK message front and back,
BL> because QWK pads out to 128-byte records with spaces.
BL> I trimmed them but it made no difference to the archive,
BL> which was compressing consecutive spaces rather well.

Yeah, and a more subtle effect is the stuff like the word SEENBY itself.
You might think it might be worthwhile converting that text into a single
byte say with a hi bit set to flag it as a keyword. But in practice with
an archiver that works on dictionaries as well as runs of chars, its not
likely to be worth the trouble, best to let the archiver do that size wise.
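
The padding effect mentioned above is easy to try out. The sketch below
is only an illustration: zlib's deflate stands in for whatever archiver
is actually in use, and the word list and message lengths are made up.
It pads synthetic messages to 128-byte records with spaces and compares
raw and packed sizes with and without the padding.

```python
import random
import zlib

RECORD = 128  # QWK record size in bytes
WORDS = ["packet", "message", "archive", "converter", "header",
         "netmail", "echomail", "reader", "counter", "twit"]


def make_messages(n, seed=1):
    """Build n synthetic message bodies of varying length."""
    rng = random.Random(seed)
    return [
        " ".join(rng.choice(WORDS) for _ in range(rng.randint(20, 120)))
        .encode("ascii")
        for _ in range(n)
    ]


def pad(msg):
    """Pad with spaces up to a whole number of 128-byte records."""
    return msg + b" " * (-len(msg) % RECORD)


def sizes(messages):
    """Raw and deflate-packed sizes, with and without record padding."""
    trimmed = b"".join(messages)
    padded = b"".join(pad(m) for m in messages)
    return {
        "raw_trimmed": len(trimmed),
        "raw_padded": len(padded),
        "packed_trimmed": len(zlib.compress(trimmed, 9)),
        "packed_padded": len(zlib.compress(padded, 9)),
    }
```

Calling sizes(make_messages(50)) and comparing the two raw figures with
the two packed figures shows how much of the padding survives packing.
Other archive formats may behave differently.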

RS> Yeah, no argument that its definitely harder and
RS> slower. OTOH I dont believe the other is good enough.

BL> It's made worse by my twitter. RTL gave me the poos yesterday in
BL> AVTech, and it only took three lines of code to add a twitter to the
BL> converter, but 150 messages became 127, and it ruins all my ratios!

Sure, but thats another design question. You can argue that its not
that desirable to do that in a PKT->QWK converter coz its not possible
to conveniently make visible what is made invisible. Probably rather
better done in the QWK reader which has that capability anyway. Then
you can untwit something if someone comments on something being
particularly weird and you want to look at it.

BL> I'm stumped. I can't count the messages in the packet archives;
BL> just the packets, and now even the sizes don't mean anything.

Yeah, if it was me I wouldnt twit in the PKT->QWK if only coz it roots any
checking possibility. I guess a message counter would be sort of still
feasible if the PKT->QWK produced a count of twitted messages too tho.

Or the counter in the PKTs used the same twit list spec. Gets very messy
for the more complex twitting tho where you dont just dump every single
message from a particular person but have it area or To/From sensitive.
Again tho, IMO if you want a fancy twitter it has to be in the reader,
not the converter. If only coz the fancier it gets the more likely you
are to want to tune it.
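
If the converter did report a twit count, the reconciliation is a single
subtraction. A sketch in Python, with a hypothetical function name, using
the 150 and 127 figures quoted above:

```python
def unaccounted(read_from_pkt, written_to_qwk, twitted):
    """Messages neither written nor twitted. Zero means the counts agree."""
    return read_from_pkt - written_to_qwk - twitted


# 150 messages read, 127 written: the converter would have to report
# 23 twitted messages for the books to balance.
missing = unaccounted(150, 127, 23)
```

A positive result means messages went missing somewhere. A negative one
means more came out than went in, for example a message split in two.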

BL> If I open the archive to count messages, I don't include possible
BL> unzip errors.

I dont think thats that hard, just check the archivers error code return.

BL> All I can do is count packets in the unopened archives,
BL> and make sure I process the correct number of packets,

Thats certainly better than nothing but nothing like the capability
of a proper message counter to detect problems.

BL> and then count messages to make sure I write as many to QWK as
BL> I read from PKT (including twits).

RS> Fact remains tho, for a reliable safety check message counter
RS> in PKTs, you need a completely different alg, new code, the
RS> works. I'm not convinced its particularly difficult.

BL> I dont agree with that philosophy. Two stuffed algorithms
BL> are worse than one good one.

Nope, you have had a gross brain fart on that one. The chance
of two completely separate algs producing identical wrong numbers
is vanishingly small. Particularly when applied over multiple QWKs
forever. The worst risk you take is false alarms, complaints about
a problem when there isnt one.
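
A separate PKT counter can be small. The sketch below is in Python and
assumes the FTS-0001 type 2 packet layout: a 58-byte packet header, then
packed messages that each start with a 16-bit type word of 2, followed by
six more 16-bit fields, a 20-byte date field and four null-terminated
strings (to, from, subject, text), with a zero word ending the packet.
The function name is hypothetical and this is not either correspondent's
code.

```python
import struct

PKT_HEADER_LEN = 58      # FTS-0001 packet header
MSG_FIXED_LEN = 14 + 20  # seven 16-bit words plus the 20-byte date field


def count_pkt_messages(data):
    """Count packed messages by walking the packet structure."""
    pos = PKT_HEADER_LEN
    count = 0
    while True:
        if pos + 2 > len(data):
            raise ValueError("packet ended without a terminator")
        (msg_type,) = struct.unpack_from("<H", data, pos)
        if msg_type == 0:  # zero word marks the end of the packet
            return count
        if msg_type != 2:
            raise ValueError("unexpected type %d at offset %d"
                             % (msg_type, pos))
        pos += MSG_FIXED_LEN
        for _ in range(4):  # to, from, subject, text
            end = data.find(b"\0", pos)
            if end < 0:
                raise ValueError("unterminated field in message %d"
                                 % (count + 1))
            pos = end + 1
        count += 1
```

It shares no code with the converter, so a bug in one is unlikely to
produce the same wrong number in the other.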

RS> Nope, you certainly do have to actually scrutinise the unzipper
RS> error code, not just ignore it like is done now, but thats dead easy.

BL> How?

    if errorlevel nn goto xxxx

Immediately after the archiver is invoked in the BAT file. Just
watch out about the order of evaluation if you check more than one.
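
The order matters because IF ERRORLEVEL n is true for any exit code of
n or higher, not just for n itself. A small Python simulation of a run
of such lines shows the effect. The labels and thresholds here are
hypothetical, not real archiver codes.

```python
def dispatch(exit_code, checks):
    """Simulate consecutive `if errorlevel N goto LABEL` lines.

    checks is a list of (N, label) pairs in file order. The first line
    whose threshold is met wins, as in a BAT file. Returns None if no
    line fires.
    """
    for threshold, label in checks:
        if exit_code >= threshold:  # errorlevel tests are >=, not ==
            return label
    return None


ascending = [(1, "warn"), (2, "fatal")]   # wrong order
descending = [(2, "fatal"), (1, "warn")]  # highest value first

wrong = dispatch(2, ascending)   # the "warn" line fires first
right = dispatch(2, descending)  # the "fatal" line fires
```

So when checking more than one value, test the highest first.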

RS> The other big deficiency is the inconsistency, the netmail being
RS> quite different in detail to the echomail etc.

BL> Yair! That's what gives me the shits most of all. You have to
BL> actually write two different headers and footers for the message.
BL> I can't imagine why they did that, especially when it introduces
BL> a conflict over the "AREA:" line.

Its an accident of history where the original concept was netmail
and echo mail was kludged on after, hence the AREA being kludged
on coz you need one.

BL> Writing code is very like designing a circuit;

RS> Its not as similar as you might think actually.


(Continued to next message)

--- PQWK202
* Origin: afswlw rjfilepwq (3:711/934.2)
SEEN-BY: 690/718 711/809 934
@PATH: 711/934

SOURCE: echomail via fidonet.ozzmosis.com
